CHAPTER 1


Executive  Summary 

Microwave receivers play a vital role in Electronic Warfare (EW) environments for passive identification and
localization of unknown targets emitting high-frequency electromagnetic signals. These receivers process signals
from microwave-band radars, and the majority of them utilize analog signal processing tools and
techniques. Microwave signals have very high frequency content and wide bandwidths. As of now, there are
no EW receivers that process microwave radar signals entirely in the digital domain. It is expected that, with
the emergence of increasingly faster and inexpensive digital computers and high-speed A/D converters, digital
processing of microwave signals will be the way of the future. One of the main purposes of this
project has been to complement the research on Digital Microwave Receiver Design being conducted at the EW
Laboratory at WPAFB, Dayton, Ohio.

In  addition  to  the  digital  receiver  design  problem,  some  fundamental  theoretical  aspects  of  several  classical 
System  Identification  problems  as  well  as  high-speed  implementation  of  various  Signal  Processing  algorithms 
have  also  been  addressed  as  part  of  this  project.  In  particular,  a  unified  framework  has  been  developed  for 
optimal  estimation  of  rational  transfer  function  parameters  from  prescribed  Time-Domain  or  Frequency-Domain 
specifications. This powerful unifying theoretical framework for System Identification appears to have remained
mostly unrecognized and unutilized. Apart from the digital EW receiver design problem, the proposed theoretical
foundation  is  expected  to  have  a  broad  range  of  applications  in  rational  modeling. 

High-speed implementation of digital signal processing algorithms on Multiprocessor Architectures is an
important topic of current interest in the Signal Processing literature. In EW applications, as well as in other
hardware implementations of digital systems, high-speed architectures would no doubt play an important role.
In view of this, high-speed hardware implementation of a few important signal processing algorithms has been
addressed as the final part of this project.

As  noted  above,  significant  progress  has  been  made  during  the  course  of  this  research  project.  Several 
problems  of  current  interest  have  been  addressed  and  solved  satisfactorily.  Most  of  the  new  results  were  proposed  in 
the  original  proposal  although  some  intermediate  work  had  been  undertaken  as  the  needs  arose  at  the  Wright  Labs. 
This  Final  Technical  Report  contains  the  details  of  all  the  results  of  the  research  that  have  been  accomplished 
over  the  entire  period  covered  by  the  project.  It  may  be  noted  that  some  of  the  results  presented  in  this  report 
were  initiated  as  part  of  the  work  on  the  original  proposal  (Grant  No.  AFOSR-F49620-93-1-0014). 

The importance of any research may perhaps be best judged by the quality of publications it
generates. Consequently, a significant amount of time has been devoted to preparing Journal and
Conference articles in order to report the findings of this project. Most of the results contained in this report
have been published, accepted, or presented in internationally recognized, top-quality Signal Processing
Journals/Conferences, although some recent results are currently under review or in preparation for future publication.
The  papers/publications  ensuing  from  this  research  are  listed  at  the  end  of  this  introductory  Chapter.  Copies  of 
the  papers  and  publications  can  be  made  available  to  the  Program  Monitor,  if  desired. 

The research conducted under this project can be categorized primarily into two broad themes, viz.,

(i)  Digital  EW  Receiver  Design  Problems  :  The  problems  addressed  are  as  follows  : 

(a)  A  high-resolution  method  for  AOA  estimation  using  Minimum-Norm  Method  that  does  not  rely  on  any 
Eigendecomposition 
(b) Statistical Perturbation Analysis of the DFT-based Minimum-Norm Method proposed in part-a.

(c)  A  high-resolution  Maximum-Likelihood  method  for  frequency  estimation  that  guarantees  unit-circle 
roots 

(d)  Two  methods  for  superior  estimation  of  AR  and  ARMA  parameters  when  the  observation  data  is  noisy 

(e)  Time-Domain  algorithms  for  detection  of  Electronic  Warfare  Signals  in  the  presence  of  Noise 

(f)  Pipelined-Adaptive  Tracking  of  Multiple  Sinusoidal  Frequencies. 

(ii)  System  Identification  and  Hardware  Implementation  Problems  : 

(a)  Optimal  identification  of  1-D  Rational  Systems  from  Input-Output  Data 

(b)  Optimal  identification  of  1-D  Rational  Systems  in  the  Frequency  Domain 

(c)  Optimal  Identification  of  All-Pole  Rational  Systems  in  Time-Domain 

(d) Design of Denominator Separable 2-D IIR Filters from Spatial-Impulse Response Data

(e) Design of Denominator Separable 2-D IIR Filters from 2-D Frequency Response Data

(f) Design of 2-D IIR Filters with non-separable denominator from Spatial-Impulse Response Data

(g)  High-speed  pipelined  implementation  of  1-D  Recursive  Filters  based  on  a  new  Distributed  Look-Ahead 
scheme. 

(h)  Optimal  Estimation  of  LA  filters. 

(i)  High-speed  pipelined  implementation  of  2-D  Recursive  Filters  based  on  the  Distributed  Look-Ahead 
scheme  proposed  in  part-g. 

The  report  is  organized  as  follows:  In  Chapter  2,  the  research  results  on  Digital  EW  receiver  design  related 
problems  are  reported  whereas  in  Chapter  3,  the  System  Identification  and  Hardware  Implementation  areas  are 
covered  with  complete  details.  Individual  Chapters  are  divided  into  several  Sections  by  topics.  In  the  following 
paragraphs the main results obtained in each of these sections are summarized briefly.

CHAPTER  2.  The  DIGITAL  MICROWAVE  RECEIVER  DESIGN  PROBLEM 

Section - 2.1 : Superresolution without Eigendecomposition : Method and Perturbation Analysis :
Many existing high-resolution methods, such as MUSIC or the Minimum-Norm Method (MNM), rely on special-purpose
hardware or software for obtaining the signal and noise subspace eigenvectors of Autocorrelation (AC) matrices.
In this project, we have developed a new DFT-based high-resolution frequency estimation algorithm which does
not require any eigendecomposition and hence is much less computationally intensive. It has been demonstrated
that the DFT of the AC matrix (DFT-of-AC) essentially performs an equivalent task of separating the signal
and noise subspaces. Furthermore, when the signal-subspace part of the DFT-of-AC vectors is used in MNM,
almost identical high-resolution AOA estimates are produced. The results have been published as a Journal paper
[110] and have been presented at ICASSP-94 [119]. It may be noted here that, according to one of the anonymous
reviewers of the journal paper, this work is a “significant breakthrough in source localization”.
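
For reference, the eigendecomposition step that the DFT-based approach avoids can be sketched as follows. This is a minimal illustration of the conventional eigen-based Minimum-Norm Method, not of the DFT-based algorithm developed in this project. The function name `min_norm_frequencies`, the array size, the two frequencies and the noise level are all assumptions chosen for the example, and the exact covariance matrix is used in place of an estimate from data.

```python
import numpy as np

def min_norm_frequencies(R, p):
    """Eigen-based Minimum-Norm frequency estimates from an M x M covariance matrix R."""
    M = R.shape[0]
    # eigh returns eigenvalues in ascending order, so the first M - p
    # eigenvectors span the noise subspace.
    _, vecs = np.linalg.eigh(R)
    En = vecs[:, : M - p]
    # Minimum-norm vector in the noise subspace with first element equal to 1.
    c = En[0, :].conj()
    d = En @ c / np.real(c.conj() @ c)
    # d is orthogonal to every signal steering vector, so the polynomial
    # d[0] z^(M-1) + ... + d[M-1] has roots at exp(j * omega_i).
    roots = np.roots(d)
    # Keep the p roots closest to the unit circle.
    closest = roots[np.argsort(np.abs(np.abs(roots) - 1.0))[:p]]
    return np.sort(np.angle(closest))

# Hypothetical case: exact covariance of two complex exponentials in white noise.
M, true_freqs, noise_var = 8, np.array([1.0, 1.3]), 0.1
A = np.exp(1j * np.outer(np.arange(M), true_freqs))  # steering vectors as columns
R = A @ A.conj().T + noise_var * np.eye(M)
est_freqs = min_norm_frequencies(R, p=2)
print(est_freqs)
```

The call to `np.linalg.eigh` is the costly step in this baseline; it is the part that a method based on the DFT of the AC matrix would replace.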

In the later part of this Section, we present a detailed theoretical Perturbation Analysis of the estimates
produced by the DFT-based MNM (D-MNM) algorithm. The theoretical results closely corroborate the superior
performance observed in simulations. The results indicate that the high-resolution performance of D-MNM is uniformly
superior to that of its eigen-based counterpart, especially at low SNR. The performance is also superior to that of the
eigen-based root-MUSIC method at low SNR. Furthermore, D-MNM appears to provide the best success rate among all
methods at low SNR. The close match between the theoretical and simulated performance verifies the validity of the
formulae derived here. Preliminary work has been presented at ASILOMAR-94 [118] and a detailed version is
under preparation for a Journal paper [125].

Section - 2.2 : Maximum-Likelihood Method with Exact Constraints : A recently proposed approximate
Maximum-Likelihood Estimator (MLE) of multiple exponentials converts the frequency estimation problem into
a problem of estimating the coefficients of a z-polynomial with roots at the desired frequencies. Theoretically,
the roots of the estimated polynomial should fall on the unit circle. But MLE, as originally proposed, does not
guarantee unit-circle roots. This drawback sometimes causes merged frequency estimates, especially at low SNR.
If all the sufficient conditions for the z-polynomial to have unit-circle roots are incorporated, the optimization
problem becomes too nonlinear and loses the desirable weighted-quadratic structure of MLE. In this work, the
exact constraints are imposed on each of the 1st-order factors corresponding to individual frequencies to ensure
unit-circle roots. The constraints are applied during optimization alternately for each frequency. In the absence of
any merged frequency estimates, the RMS values more closely approach the theoretical Cramer-Rao (CR) bound
at low SNR levels. The work has been published as a Journal paper [15].
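
The role of the unit-circle constraint on individual 1st-order factors can be illustrated with a small sketch. It only projects each root of an already estimated polynomial radially onto the unit circle; it is not the alternating constrained optimization described above. The helper name `project_roots_to_unit_circle` and the drifted root values are assumptions made for the example.

```python
import numpy as np

def project_roots_to_unit_circle(b):
    """Rebuild a polynomial after moving each root radially onto |z| = 1."""
    roots = np.roots(b)
    unit_roots = roots / np.abs(roots)   # keep the angle, force unit modulus
    return b[0] * np.poly(unit_roots)

# Hypothetical estimate whose roots have drifted off the unit circle.
drifted_roots = np.array([0.9 * np.exp(1j * 1.0), 1.1 * np.exp(1j * 2.0)])
b_est = np.poly(drifted_roots)
b_fixed = project_roots_to_unit_circle(b_est)

fixed_roots = np.roots(b_fixed)
freqs = np.sort(np.angle(fixed_roots))   # frequency estimates in rad/sample
print(np.abs(fixed_roots), freqs)
```

Each factor (1 - exp(j*omega) z^-1) is then parameterized by its angle alone, which is why constraining the factors one at a time keeps every root on the unit circle.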

Section - 2.3 : Improved AR-Parameter Estimation From Noisy Observation Data : Auto-Regressive
(AR) modeling is the most widely used approach for model-based spectrum estimation. But almost all the
existing methods for AR-parameter estimation show severe degradation if the observed signal is corrupted with
noise. In fact, all the commonly used techniques, such as the Autocorrelation Method (AM), the Covariance Method
(CM), the Modified Covariance Method and their variations, give poor Power Spectral Density (PSD) estimates
when the observations are noisy. In this part of the project, a data-adaptive pre-filtering approach is presented
to address this problem. The results indicate that when only noisy data are available for modeling, the proposed
technique gives more accurate PSD estimates than the commonly used methods. A conference paper on this work
has been accepted [121] and a more comprehensive version is under preparation for publication as a Journal
paper.
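
The degradation described above can be reproduced with a short sketch of the Autocorrelation (Yule-Walker) method applied to exact autocorrelation lags. The AR(2) coefficients and the 0 dB noise level are hypothetical values chosen for illustration, and the pre-filtering approach proposed in this Section is not implemented here.

```python
import numpy as np

def yule_walker(r, p):
    """Solve the Yule-Walker equations for AR(p) coefficients from lags r[0..p]."""
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, r[1 : p + 1])

# Hypothetical AR(2) process x[n] = 1.5 x[n-1] - 0.8 x[n-2] + w[n], var(w) = 1.
phi1, phi2 = 1.5, -0.8
rho1 = phi1 / (1.0 - phi2)                      # normalized lag-1 autocorrelation
rho2 = phi1 * rho1 + phi2                       # normalized lag-2 autocorrelation
r0 = 1.0 / (1.0 - phi1 * rho1 - phi2 * rho2)    # process variance
r_clean = r0 * np.array([1.0, rho1, rho2])

# White observation noise of equal power (0 dB SNR) only adds to lag 0.
r_noisy = r_clean.copy()
r_noisy[0] += r0

a_clean = yule_walker(r_clean, 2)
a_noisy = yule_walker(r_noisy, 2)
print(a_clean, a_noisy)
```

With clean lags the true coefficients are recovered, whereas the extra noise power at lag 0 pulls the estimated poles toward the origin and flattens the resulting PSD peak.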

Section - 2.4 : Improved ARMA-Parameter Estimation From Noisy Observation Data : Existing
methods for ARMA modeling assume that the available process is produced by an ARMA system driven by a
white input process, i.e., the observed process is considered to be pure ARMA. In practice, the available data
usually have observation noise added to them, but the ARMA methods do not address this problem. Simulations
show that the performance of the existing ARMA methods deteriorates when the observation process is noisy. In this
part of the project a new ARMA algorithm is given which utilizes a recently developed deterministic rational
system identification method (OM-IO) that minimizes the modeling or output error norm. The algorithm first
estimates the input process and then invokes OM-IO using the input-output data. Simulations indicate that the
proposed method is quite effective even for low-SNR observation data. A conference paper on this work has been
accepted [120] and a more comprehensive version is under preparation for publication as a Journal paper.

Section - 2.5 : Time-Domain Detection of Electronic Warfare Signals in Noise : Almost all existing
AOA/RF estimation algorithms assume that the signal is already present in the observed data. But in the passive
mode of operation of EW applications, source signals may not be present at all within the observation window,
or the signals may fill only a part of the estimation window. In either case, any frequency estimation algorithm
would essentially produce erroneous or noise frequencies because the observed signal would not satisfy the model
assumed by the estimation algorithm. Considering the relatively high computational burden, any estimation
method should be invoked only when a detection scheme indicates a high probability that a threat is present. In
this part of the project, the theory of detecting sinusoids from Quantized and Noisy time-domain observation
samples has been developed. The theoretical work on single/multiple samples is mostly complete. Studies with
Quantized data have also been performed and the results appear reasonably good. Lab tests for the Envelope
Detection and Square-Law cases have been conducted at Wright Labs with satisfactory results.
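
The idea of gating the estimator with a time-domain detector can be sketched with a simple square-law (energy) test. This is only an illustrative sketch under assumed conditions: the window length, amplitude, noise level and the six-standard-deviation threshold are hypothetical, quantization is ignored, and the detection theory developed in this Section is not reproduced.

```python
import numpy as np

def square_law_detect(x, threshold):
    """Declare a signal present when the summed squared samples exceed the threshold."""
    return float(np.sum(x ** 2)) > threshold

rng = np.random.default_rng(0)
N, noise_std = 256, 1.0
n = np.arange(N)
noise_only = rng.normal(0.0, noise_std, N)
with_signal = 2.0 * np.cos(0.3 * np.pi * n) + rng.normal(0.0, noise_std, N)

# Hypothetical threshold: the noise-only energy has mean N*var and standard
# deviation sqrt(2N)*var, so this sits about six standard deviations above the mean.
threshold = N * noise_std**2 + 6.0 * np.sqrt(2.0 * N) * noise_std**2

print(square_law_detect(noise_only, threshold))   # estimator would be skipped
print(square_law_detect(with_signal, threshold))  # estimator would be invoked
```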

Section - 2.6 : Pipelined-Adaptive Tracking of Multiple Sinusoidal Frequencies : New Pipelined-
Adaptive algorithms are proposed for tracking multiple Frequencies or Angles-of-Arrival (AOA) of moving targets.
Pipelining of adaptive filters poses a critical challenge because of the timing mismatch arising from the feedback
signals. In this work, some relaxation techniques have been utilized to pipeline adaptive algorithms for
high-speed tracking of frequencies/AOAs. Two adaptive tracking algorithms have been mapped into pipelined forms,
namely Least-Mean Squares (LMS) and Recursive Least-Squares (RLS). Preliminary results have been presented
at ISCAS-96 [115] and a Journal version is under preparation for possible publication [129].
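
One well-known relaxation for pipelining LMS is to let the weight update use an error that is several samples old (delayed LMS). The sketch below assumes that relaxation purely for illustration; the report does not state which relaxations were used. The function name, step size and delay are hypothetical, and a single noiseless sinusoid is tracked with a 2-tap linear predictor.

```python
import numpy as np
from collections import deque

def delayed_lms_frequency(x, mu=0.02, delay=4):
    """Track one sinusoid with a 2-tap linear predictor adapted by delayed LMS."""
    w = np.zeros(2)
    pending = deque()                     # (error, regressor) pairs awaiting use
    for n in range(2, len(x)):
        u = np.array([x[n - 1], x[n - 2]])
        e = x[n] - w @ u                  # prediction error with current weights
        pending.append((e, u))
        if len(pending) > delay:
            e_old, u_old = pending.popleft()
            w = w + mu * e_old * u_old    # update uses data that is `delay` steps old
    # For a pure sinusoid the ideal predictor is x[n] = 2cos(w0) x[n-1] - x[n-2].
    return float(np.arccos(np.clip(w[0] / 2.0, -1.0, 1.0))), w

omega = 0.3 * np.pi
x = np.cos(omega * np.arange(8000))
omega_lms, _ = delayed_lms_frequency(x, delay=0)    # ordinary LMS
omega_dlms, _ = delayed_lms_frequency(x, delay=4)   # update delayed by 4 samples
print(omega, omega_lms, omega_dlms)
```

The delay breaks the sample-by-sample feedback loop, so the error computation and the weight update can occupy separate pipeline stages, at the cost of a tighter bound on the usable step size.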

CHAPTER  3.  System  Identification  and  Hardware  Implementation  Problems 

Fundamental contributions have been made in 1-D and 2-D Rational System Identification theory. Several key
journal papers have been published/accepted and a number of conference publications have also been generated.
The proposed comprehensive framework encompasses a large class of Identification problems including (a) Input-
Output data [18, 124], (b) Impulse Response Data : AR case [19, 123], ARMA case [111], (c) Frequency
Response Data [13, 122], and (d) Multivariable System Identification [17]. It also covers shaping the Time responses of
Minimum Phase Systems [14]. Key results are summarized below.

Section - 3.1 : Identification of 1-D Rational Systems from Input-Output Data : A theoretical and
algorithmic framework is proposed for optimal identification of rational transfer function parameters of
discrete-time linear systems from Input-Output (IO) data. The nonlinear criterion is theoretically decoupled into a purely
linear problem for estimating the optimal numerator and a nonlinear problem for the optimal denominator. The
proposed decoupled approach has reduced computational requirements when compared to existing algorithms
which estimate the parameters simultaneously. This research has led to one Journal paper [18] and a Conference
paper [124].
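
The linear half of this decoupling can be sketched as follows: once a denominator is fixed, the output is linear in the numerator coefficients, so the optimal numerator follows from one least-squares solve. This is a minimal sketch under assumed values, not the algorithm of [18]; the helper names, the example system and the random input are hypothetical, and the nonlinear denominator search is omitted.

```python
import numpy as np

def all_pole_filter(a, x):
    """Filter x through 1/A(z), with a = [1, a1, ..., ap]."""
    g = np.zeros(len(x))
    for n in range(len(x)):
        acc = x[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * g[n - k]
        g[n] = acc
    return g

def optimal_numerator(a, x, y, q):
    """Least-squares numerator b (order q) for a fixed denominator a."""
    g = all_pole_filter(a, x)
    # Column k is g delayed by k samples, so that y is approximated by G @ b.
    G = np.column_stack(
        [np.concatenate([np.zeros(k), g[: len(g) - k]]) for k in range(q + 1)]
    )
    b, *_ = np.linalg.lstsq(G, y, rcond=None)
    return b, float(np.sum((y - G @ b) ** 2))

# Hypothetical system B(z)/A(z) with b = [1, 0.5] and a = [1, -0.9].
rng = np.random.default_rng(1)
x = rng.normal(size=200)
a_true, b_true = np.array([1.0, -0.9]), np.array([1.0, 0.5])
y = np.convolve(b_true, all_pole_filter(a_true, x))[: len(x)]

b_hat, err_true = optimal_numerator(a_true, x, y, q=1)
_, err_wrong = optimal_numerator(np.array([1.0, -0.5]), x, y, q=1)
print(b_hat, err_true, err_wrong)
```

Because the numerator is eliminated in closed form, an outer iteration only has to search over the denominator coefficients, with the residual returned here serving as its cost.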

Section - 3.2 : Identification of 1-D Rational Systems in the Frequency Domain : A new Frequency-
Domain (FD) approach has been developed for optimal estimation of rational transfer function coefficients. The
proposed method seeks to match arbitrarily shaped FD specifications in the Least-Squares (LS) sense. The
desired specifications may be arbitrarily spaced in frequency. The design is performed directly in the digital domain
and no analog-to-digital transformation is necessary. The proposed method makes use of the inherent mathematical
structure in this rational modeling problem to theoretically decouple the numerator and denominator estimation
problems into two smaller dimensional problems. The denominator criterion is nonlinear but possesses a
weighted-quadratic structure which is convenient for iterative optimization. The optimal numerator is found linearly by
solving a set of simultaneous equations. The decoupled criteria retain the global optimality properties. The
performance of the algorithm is demonstrated with some simulation examples. This research has led to one
Journal paper [13] and a Conference paper [122].

Section - 3.3 : Identification of All-Pole Rational Systems in Time-Domain : An algorithm is proposed
for optimal estimation of the parameters of Auto-Regressive (AR) or all-pole transfer function models from
prescribed impulse response data. The transfer function coefficients are estimated by minimizing the ℓ2-norm of
the exact model fitting error. Existing methods either minimize equation errors or modify the true nonlinear
fitting error criterion. In the proposed method, the multidimensional nonlinear error criterion has been decoupled
into a purely linear and a nonlinear subproblem. Global optimality properties of the decoupled estimators
have been established. For data corrupted with Gaussian-distributed noise, the proposed method produces
Maximum-Likelihood Estimates (MLE) of the AR-parameters. The inherent mathematical structure in the
nonlinear subproblem is exploited in formulating an efficient iterative computational algorithm for its minimization.
The proposed algorithm provides a useful computational tool, based on an appropriate theoretical foundation, for
accurate modeling of all-pole systems from prescribed impulse response data. The effectiveness of the algorithm
has been demonstrated with several simulation examples. This research has led to one Journal paper [19] and a
Conference paper [123].

Section - 3.4 : Design of Denominator Separable 2-D IIR Filters : This work extends the 1-D results
in [111] to 2-D system identification. In this part of the report, the optimal design of an important class of
two-dimensional (2-D) digital IIR filters from spatial impulse response data is addressed. The denominator of
the desired 2-D filter is assumed to be separable into two 1-D factors. The filter coefficients are estimated
by minimizing the ℓ2-norm of the error between the prescribed and the estimated spatial domain responses.
The denominator and numerator estimation problems are theoretically decoupled into separate problems. The
decoupled criteria have reduced dimensionality. The denominator criterion is simultaneously optimized w.r.t. the
coefficients in both dimensions using an iterative algorithm. The numerator coefficients are found in a
straightforward manner. If the desired response is known to be symmetric, the proposed algorithm can be constrained
to have separable denominators. Initial results have been published as a Journal paper [16] and some further
developments are currently being considered [12].

Section - 3.5 : Optimal Frequency Domain Design of Denominator Separable Two-Dimensional
Digital IIR Filters : Classical design techniques using Butterworth, Chebyshev or Elliptic polynomials are
limited to particular types of design specifications, such as bandpass, lowpass, etc. A least-squares technique is
presented for designing quarter-plane separable-denominator 2-D IIR filters to best approximate prescribed
frequency domain (FD) specifications of any arbitrary shape. A Structured Matrix Approximation approach is utilized
to show that the FD error vector is linearly related to the 2-D numerator coefficients whereas the relationship
with the 2-D denominators is quasi-linear. Furthermore, the numerator and denominator estimation problems are
theoretically decoupled. The quasi-linear relationship with the denominator is used to formulate an algorithm for
iterative estimation of the denominator. The numerator is found in one step using the estimated denominator.
Computer simulations show the effectiveness of the proposed method and its superior performance compared to
several existing methods. This work has been presented at ICASSP-95 [117]. A detailed version is also under
preparation for a Journal paper [130].

Section - 3.6 : Optimal Spatial-Domain Design of 2-D IIR Filters : In this Section we present a structured
matrix approximation framework to develop the most general form for optimal least-squares (LS) design of 2-D
recursive filters from prescribed spatial domain data. Unlike the work in Section 3.4, no separability is assumed
for the 2-D denominator. Utilizing matrix structures inherent in this problem, it is shown that the exact ℓ2 error
has a purely linear relationship with the 2-D numerator parameters whereas the 2-D denominator coefficients are
nonlinearly related to the error. More interestingly, the denominator and numerator estimation problems
are theoretically decoupled into separate problems without affecting any optimality properties. In the decoupled
form, the numerator estimation problem is shown to be purely linear. For estimating the denominator, it is also
shown that the decoupled ℓ2 error vector possesses a quasi-linear relationship with the denominator coefficients.
Decoupled estimation leads to reduced computational complexity because there is no need for iterating on the
numerators. Simulation results indicate that for several common filter design problems, the proposed general
version performs better than the separable design developed earlier in Section 3.4. Preliminary results have been
presented at ISCAS-95 [115] and a Journal version is under consideration for possible publication [11].

Section - 3.7 : Distributed Look-Ahead : A General Approach for Pipelining Recursive Digital
Filters : A new Look-Ahead (LA) scheme, Distributed Look-Ahead (DLA), is proposed for pipelined implementation
of recursive digital filters. It is established that in the case of many recursive filters, DLA can provide equivalent and
stable implementation with reduced pipeline delay and hardware complexity, when compared with some existing
LA schemes. The existing Scattered Look-ahead implementation achieves stability at the cost of increased
multiplication and latch complexities and considerable delay in output generation. The Clustered look-ahead approach
cannot always guarantee stability. This work shows that, in order to attain stability, the output samples need
not be clustered or equally scattered. Indeed, in many filter design problems, stability can be maintained by using
unequally distributed past output samples. When compared with the scattered approach, the proposed scheme
uses fewer pole-zero cancelations and the introduced roots are not necessarily at the same radii as
the original filter poles. Hence, the proposed DLA scheme has reduced multiplication and latch complexities and
higher area-efficiency, and it produces outputs with reduced delays. Preliminary results have been presented at
ICASSP-96 [113] and ISCAS-96 [114] and a Journal version is under preparation for possible publication [126].
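
The basic look-ahead idea that these schemes build on can be sketched for a first-order recursive filter, where the clustered and scattered forms coincide. This is an illustrative sketch of classical look-ahead only, not of the DLA scheme; the pole value, the look-ahead depth and the helper names are assumptions.

```python
import numpy as np

def look_ahead_first_order(a, M):
    """M-step look-ahead form of H(z) = 1 / (1 - a z^-1)."""
    # Multiply numerator and denominator by 1 + a z^-1 + ... + a^(M-1) z^-(M-1).
    num = a ** np.arange(M)
    den = np.convolve([1.0, -a], num)   # equals 1 - a^M z^-M
    return num, den

def impulse_response(num, den, n_samples):
    """Impulse response of num(z)/den(z) computed by direct recursion."""
    h = np.zeros(n_samples)
    for n in range(n_samples):
        acc = num[n] if n < len(num) else 0.0
        for k in range(1, len(den)):
            if n - k >= 0:
                acc -= den[k] * h[n - k]
        h[n] = acc / den[0]
    return h

a, M = 0.9, 4
num, den = look_ahead_first_order(a, M)
h_original = impulse_response(np.array([1.0]), np.array([1.0, -a]), 30)
h_pipelined = impulse_response(num, den, 30)
print(den)
```

The transformed recursion depends only on the output M samples back, which leaves M delay elements in the feedback loop for pipelining, while the M - 1 cancelling zeros are the added hardware cost that the text refers to as pole-zero cancelations.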

Section  -  3.8  :  Optimal  Least-Squares  Design  of  Pipelined  Recursive  Filters  in  the  Time-Domain 
:  Currently,  look-ahead  (LA)  pipelined  recursive  filters  are  obtained  primarily  via  transformation  of  a  given 
un-pipelined  transfer  function.  For  these  approaches,  it  is  assumed  that  the  un-pipelined  transfer  function  has 
already  been  designed  as  an  intermediate  step.  In  this  Section,  we  present  a  new  algorithm  (OM-LA)  for  direct 
and  optimal  estimation  of  the  coefficients  of  recursive  filters  in  look-ahead  pipelined  form.  OM-LA  is  developed  by 
appropriate  modification  of  a  recently  proposed  optimal  method  (OM)  for  designing  un-pipelined  filters  (developed 
previously  by  the  PI  as  part  of  a  project  supported  by  the  AFOSR).  It  is  demonstrated  that  the  proposed  one-step 
approximation  can  achieve  superior  match  with  reduced  pipelined  filter  order  because  it  does  not  rely  on  pole-zero 
cancelations  as  in  current  LA  pipelining  approaches.  It  is  also  shown  that  the  denominator  polynomial  can  be 
constrained  to  possess  any  of  the  possible  look-ahead  configurations.  Unlike  some  existing  methods,  OM-LA 
minimizes  the  true  time-domain  fitting  error-norm  between  the  prescribed  and  the  estimated  impulse  response 
and  produces  superior  results.  Preliminary  results  have  been  presented  at  ICASSP-96  [112]  and  a  Journal  version 
is  under  preparation  for  possible  publication  [127]. 

Section - 3.9 : Pipelined Look-Ahead Implementation of a Class of 2-D IIR Filters : In Section 3.7,
we have presented a new scheme (referred to as distributed look-ahead) which is a compromise between the two
existing look-ahead approaches for high-speed implementation of 1-D Recursive Digital filters. To date, neither
the Scattered Look-ahead nor the Distributed scheme has been utilized for 2-D IIR filter implementation,
primarily because the 1-D stability properties of these LA schemes do not necessarily translate to general 2-D
IIR filters. The primary focus of this Section is to demonstrate that for a special but very important class of 2-D
IIR filters, namely Denominator Separable configurations, the benefits of these stable look-ahead schemes can
indeed be exploited. The results will be submitted for review to ASILOMAR-97, which will be held in
November at the Naval Postgraduate School [128]. A detailed version is also under preparation for a Journal paper
[131].
CHAPTER 2


THE  DIGITAL  MICROWAVE  RECEIVER  DESIGN  PROBLEMS 


Introduction 

Digital processing of microwave signals in an Electronic Warfare (EW) environment poses a great challenge to
researchers in Signal Processing. Along with the standard requirements of any conventional radar, the EW receiver
design problem is complicated by the fact that no knowledge about the input signal is available to the receiver.
The nature of the problem also requires that measurements and decisions be taken immediately or within a
few seconds in an entirely passive mode of operation. All microwave receivers used in practice utilize analog
signal processing techniques. The frequency bands of the EW signals are in the GHz range and the signals have
wide bandwidths, which necessitates sampling and processing a massive amount of data at or near real-time.
Presently, no EW receiver processes microwave radar signals entirely in the digital domain. But it is expected
that with the emergence of increasingly faster and inexpensive digital computers and high-speed A/D converters,
digital processing of microwave signals will be the way of the future.

In the past two decades, many classes of radar and sonar receivers have been converted from conventional
analog technology to purely digital or hybrid systems, but EW receivers are yet to make such a transition. The
primary technological factors that have been holding back fabrication of a digital EW receiver are
probably twofold. Firstly, if Analog-to-Digital (A/D) converters are to be used at the operating frequency range,
then the Nyquist rate would necessitate sampling in the GHz range, and secondly, the digital hardware or firmware
must have the capacity to process such high data rates and produce effective results at near real-time.
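
The scale of these two requirements can be made concrete with a short calculation. The band edge and converter resolution below are hypothetical figures chosen only for illustration; they are not taken from this report.

```python
# Hypothetical figures for illustration only; none are taken from the report.
f_max_hz = 18e9        # assumed highest frequency of interest
bits_per_sample = 8    # assumed A/D resolution

nyquist_rate_hz = 2 * f_max_hz                          # minimum rate for direct sampling
data_rate_bytes = nyquist_rate_hz * bits_per_sample / 8  # raw bytes per second

print(f"sampling rate >= {nyquist_rate_hz / 1e9:.0f} GS/s")
print(f"raw data rate >= {data_rate_bytes / 1e9:.0f} GB/s")
```

Under these assumptions a single channel already produces tens of gigabytes per second before any processing, which is why both the converter and the downstream hardware are limiting factors.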

Digital  EW  receivers  can  be  expected  to  offer  some  major  advantages  over  their  analog  counterparts.  Foremost 
among  these  is  the  almost  lossless  storage  capability  of  digital  memories  which  can  eliminate  the  dependence  on 
lossy  analog  delay  lines.  Digital  processors  and  memory  chips  are  relatively  inexpensive,  compact  in  size  and  have 
low  weight  and  the  trends  are  towards  even  further  reductions.  Digital  signal  processing  algorithms  and  digital 
computing technology have matured tremendously and offer a wide range of capabilities. Parallel processing,
pipelining, RISC, VLSI design, systolic architecture, vectorization and array processing, and fault-tolerant computing
are only some of the well-known aspects of digital computing that the last few decades of research have
produced. As our research progresses, we intend to study whether some of these ideas can be incorporated in the digital
receiver  in  order  to  improve  the  efficiency  and  accuracy  of  its  performance. 

The  primary  task  of  a  microwave  receiver  is  to  gather  data  for  sorting  of  signals  and  for  identifying  the 
radar-type. Based on this information, jamming, weapon delivery or other options are considered. In order to
perform  these  tasks,  the  receiver  must  analyze  the  received  radar  pulses  and  measure  or  estimate  the  following 
six  parameters  :  Angle-of-Arrival  (AOA),  Radio  Frequency  (RF),  Time  of  Arrival  (TOA),  Pulse  Amplitude  (PA), 
Pulse  Width  (PW)  and  Polarization  (P). 

A  critical  requirement  of  an  EW  receiver  is  the  AOA  measurement  which  is  known  to  be  a  rather  difficult 
multidimensional nonlinear optimization problem, especially when multiple closely-spaced threats are to be
resolved. It is also desirable to have high sensitivity and large dynamic range such that a broad range of signals,
including  weak  ones,  can  be  detected. 

As part of this project, several AOA/frequency estimation algorithms have been developed and studied. Most existing high-resolution frequency-estimation algorithms rely on special-purpose hardware or software for Eigendecomposition or SVD. In Section 2.1, a DFT-based Minimum-Norm method is proposed which does not require any eigendecomposition but produces high-resolution frequency estimates. This new algorithm needs only to compute the DFT of the Autocorrelation matrix to separate the signal and noise subspaces. Hence the computational burden is much lower than that of existing high-resolution methods, and the algorithm appears to be well suited for EW applications. The Statistical Performance Analysis of this new algorithm has also been performed and the results are included in Section 2.1.

Another new class of algorithms, referred to as KiSS/IQML, has been developed recently for obtaining the Maximum Likelihood Estimates (MLE) of frequencies or AOAs from the roots of z-polynomials. However, the estimated roots are not guaranteed to fall on the unit circle, as desired. Based on the theory of zeros of polynomials, a new scheme is proposed in Section 2.2 that ensures unit-circle roots. Many frequency estimation methods make use of the property that a sinusoidal process can be modeled as a limiting case of a narrow-band auto-regressive (AR) process. But the performance of all existing AR parameter estimation methods degrades significantly when the observation data are corrupted with noise. A pre-filtering approach is presented in Section 2.3 that can improve AR-parameter estimates from noisy observation data. Another data-adaptive approach for improved modeling of ARMA processes from noisy observations is given in Section 2.4.

Parameter estimation schemes either follow or work in parallel with a detection scheme that establishes the presence of a threat. A combined detection-estimation scheme has the potential to cut down the computational burden on the signal processor. As a part of this project, the statistical theory of hypothesis testing has been utilized for detecting whether a threat is present or not. In Section 2.5, the time-domain detection problem is presented for single and multiple samples. Specifically, the detection thresholds and Probability of Detection based on the Neyman-Pearson Criterion have been derived.
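
As a rough illustration of a Neyman-Pearson threshold for single and multiple samples, the following minimal sketch assumes a textbook setting that is not taken from Section 2.5: a known constant signal level in real, zero-mean white Gaussian noise, with the sample mean used as the test statistic. The function name `np_detector` and its parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def np_detector(amplitude, sigma, n_samples, p_fa):
    """Return (threshold, P_D) for a sample-mean Neyman-Pearson test.

    Assumed model (illustrative only): H0 is noise alone, H1 is a known
    constant `amplitude` plus noise; noise is real Gaussian with standard
    deviation `sigma`, independent from sample to sample.
    """
    std_normal = NormalDist()
    # Standard deviation of the mean of n independent noise samples.
    sigma_mean = sigma / sqrt(n_samples)
    # Threshold that holds the false-alarm probability at p_fa.
    threshold = sigma_mean * std_normal.inv_cdf(1.0 - p_fa)
    # Probability that the sample mean exceeds the threshold under H1.
    p_d = 1.0 - std_normal.cdf((threshold - amplitude) / sigma_mean)
    return threshold, p_d

thr_1, pd_1 = np_detector(amplitude=1.0, sigma=1.0, n_samples=1, p_fa=0.01)
thr_16, pd_16 = np_detector(amplitude=1.0, sigma=1.0, n_samples=16, p_fa=0.01)
```

Under these assumptions, averaging more samples lowers the threshold needed for the same false-alarm rate and raises the Probability of Detection.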

The AOA estimation algorithms presented in Sections 2.1-2.2 work in batch mode, where the targets are assumed to be "locally stationary". However, in many practical situations the targets may be non-stationary and it is desirable to track their movements adaptively. To this end, Pipelined-Adaptive algorithms are proposed in Section 2.6 for tracking multiple Frequencies or Angles-of-Arrival (AOA) of moving targets. Pipelining of adaptive filters poses a critical challenge because of the timing mismatch arising from the feedback signals. In this work, some relaxation techniques have been utilized to pipeline adaptive algorithms for high-speed tracking of frequencies/AOAs.

Section - 2.1 : SUPERRESOLUTION WITHOUT EIGENDECOMPOSITION : METHOD AND PERTURBATION ANALYSIS


SUMMARY 

Many existing methods for estimating closely spaced sinusoidal frequencies utilize the eigenvectors of Autocorrelation (AC) matrices [1, 10, 15, 26, 30]. Instead, this work considers the use of the DFT of AC matrices (DFT-of-AC) for extracting the signal and noise subspaces. When the signal-subspace part of the DFT-of-AC vectors is used in the Minimum-Norm method (MNM) framework, almost identical high-resolution frequency estimates are produced. Theoretical Perturbation Analysis of the proposed DFT-based MNM (D-MNM) has also been carried out. The analysis confirms that the estimates are theoretically unbiased and have lower theoretical Mean-Squared Error, indicating improved high-resolution performance, especially at low SNR. The primary advantage of extracting signal/noise subspace information from the DFT-of-AC is lower computational and hardware complexity than existing methods that need to perform the Eigendecomposition iteratively.

I : Introduction

In many important practical applications, such as radar, sonar and astronomy, the resolution capability of the FFT is inadequate. Overcoming the resolution limitation of the DFT has been a vigorously researched topic in Signal Processing in the past three decades. The modern methods attain the desired ‘High-Resolution’ or ‘Superresolution’ at the cost of considerable computational burden. The existing well-known methods often utilize Eigen-Decomposition (ED), Singular Value Decomposition (SVD) or the Maximum Likelihood (ML) method, which is based on nonlinear optimization [1-5, 8-10, 14-20, 22-24, 26, 29, 30, 32-37, 42-45, 48, 49]. These algorithms, though highly effective, can only be implemented iteratively because of their inherent nonlinearity, which limits their real-time capabilities.

The primary objective of this paper is to effectively combine the computational simplicity of the DFT with the underlying mathematical philosophy of certain high-resolution methods. The desired goal is to achieve high resolution without any iterative optimization. Specifically, many existing high-resolution techniques, such as the Minimum-Norm method (MNM), extract the signal and noise subspace information from the eigenvectors of the Autocorrelation (AC) matrices [15, 26]. It is shown in this paper that the DFT of the AC-matrix (DFT-of-AC) essentially performs an equivalent task of extracting and decoupling the signal and noise subspace information. Hence, it is proposed that the signal eigenvectors be replaced by the largest-norm DFT-of-AC vectors. It is demonstrated that when the DFT-of-AC vectors with larger norms are used in the MNM framework, mostly better or almost equivalent high-resolution DOA estimates are produced. The bias, mean-squared error and the root locations of the proposed DFT-based MNM (D-MNM) also compare well with the Eigendecomposition-based MNM (E-MNM). The simulations further show that the high-resolution performance of the D-MNM is more robust at low SNR.

In order to establish theoretical justification of the performance of the D-MNM algorithm, we have also conducted a theoretical Perturbation Analysis of the estimates produced by the algorithm. The theoretical study agrees closely with the superior performance already observed in simulations. The details of the following are included after introducing the method. Firstly, the Bias and the Mean-Squared-Error (MSE) in the estimates of the AOAs are shown to be linearly related to the Bias and MSE in the roots of the D-MNM polynomial which, in turn, are shown to depend on the Bias and MSE of the D-MNM coefficient vector. Then the statistics of the coefficient error are related to those of the observed data and the AC matrix estimate. Finally, all the intermediate results are combined and utilized to find the direct statistical relationship between the AOA errors and the observations. The theoretical results indicate that the high-resolution performance of D-MNM is uniformly superior to that of its eigen-based counterpart, especially at low SNR. The performance is also superior to that of the eigen-based root-MUSIC method at low SNR. Furthermore, D-MNM appears to provide the best success rate among all methods at low SNR. The theoretical analysis closely follows the performance with simulated data, which verifies the validity of the formulae derived here theoretically.

The major significance of the proposed algorithm is that no complicated iterative optimization is needed and the signal-subspace information is extracted by only a single matrix multiplication. Hence, hardware implementation of D-MNM for real-time high-resolution AOA/Frequency estimation may be feasible with currently available technology. It may be noted here that results of some preliminary simulations on the DFT-based method were presented in [38, 39], though no performance analysis was available at the time.

The  paper  is  organized  as  follows.  In  Section  II,  the  AOA  estimation  problem  is  defined  and  in  Section  III 
some  existing  approaches  are  discussed.  Some  useful  properties  of  the  AC  matrix  are  given  in  Section  IV.  Then  in 
Section  V,  the  proposed  D-MNM  algorithm  is  described.  In  Section  VI,  the  details  of  the  Perturbation  analysis  of 
D-MNM are presented. Some simulation results are given in Section VII and, finally, the paper is concluded with some remarks in Section VIII.

II  :  PROBLEM  DEFINITION 

This paper addresses the problem of estimating the Directions of Arrival (DOA) of densely spaced narrowband targets. Suppose that $p$ plane waves originating from far-field point sources at distinct directions impinge on a linear array of $N$ equally spaced sensors. The signals sampled simultaneously at the $m$-th instant of time at the $N$ sensors form a ‘snapshot’ vector defined as,

$$\mathbf{x}_m \triangleq [\, x_m(0) \;\; x_m(1) \;\; \ldots \;\; x_m(N-1) \,]^T \tag{II.1}$$

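In the presence of noise, the observation samples can be written as,

$$x_m(n) = \tilde{x}_m(n) + z_m(n) \tag{II.2}$$

where $z_m(n)$ represents the additive observation noise and/or the modeling error and $\tilde{x}_m(n)$ denotes the signal part of the observation, which is given by

$$\tilde{x}_m(n) = \sum_{i=1}^{p} A_m(i)\, e^{j\left[\frac{2\pi d}{\lambda} n \sin\theta_i + \phi_m(i)\right]}, \qquad n = 0, 1, \ldots, N-1 \tag{II.3}$$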

where,

$p$ : Number of narrowband sources present

$d$ : Spacing between sensor elements

$\lambda$ : Wavelength of radiation of the received signals

$\theta_i$ : Direction-of-Arrival (DOA) of the $i$-th source

$A_m(i)$ : Amplitude of the $i$-th source at the $m$-th snapshot

$\phi_m(i)$ : Phase angle of the $i$-th source at the $m$-th snapshot, uniformly distributed between $-\pi$ and $\pi$.

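The noise $z_m(n)$ is assumed to be zero-mean and uncorrelated with the source signals, and it has a variance of $\sigma^2$. The signal part of the observation can be written in a more succinct form as,

$$\tilde{x}_m(n) = \sum_{i=1}^{p} A_{im}\, e^{j\omega_i n} \tag{II.4}$$

where $\omega_i$ and $A_{im}$ are defined as

$$\omega_i \triangleq \frac{2\pi d}{\lambda}\sin\theta_i, \qquad A_{im} \triangleq A_m(i)\, e^{j\phi_m(i)}. \tag{II.5}$$
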

III :  Existing  Methods 

It is apparent from the problem statement that the DOA ($\theta_i$) estimation problem is mathematically equivalent to the Frequency Estimation ($\omega_i$) problem, which has been a major research topic in many areas of science. Indeed, in the last couple of hundred years, the search for ‘hidden periodicities’ in observed data has appeared in varied forms in several seemingly differing disciplines of science.

III.a : The Periodogram and its Resolution Limitation

Ever since its discovery in 1965 [6], the FFT has been the primary tool for estimating Directions of Arrival (DOA) or frequencies of far-field sources from noisy observation data. The software or hardware implementation of the FFT is remarkably straightforward. To date, the periodogram continues to be the most frequently used method for frequency/DOA estimation [21, 27]. In fact, it is well known that for localizing a single target, if the noise in the observed data is Gaussian, the periodogram [27] produces the maximum likelihood estimate. But in the case of multiple targets, the periodogram cannot resolve two frequencies which are separated by less than the bin-width of the FFT: it fails to distinguish the two closely spaced frequencies and provides only a single frequency estimate instead of two. This is precisely the problem one faces while resolving two closely spaced sinusoids when only a relatively short data record is available. Clearly, if an unlimited amount of data is available for processing, the periodogram of sufficiently Zero-padded (and possibly Windowed) data will provide reasonably good estimates. But in many problems of practical interest only a short data record is available and one has to overcome the periodogram’s resolution limitation by resorting to what are commonly known in the signal processing literature as ‘High-Resolution’ or ‘Superresolution’ techniques. The major contributions in the higher resolution approaches are highlighted next.
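
The resolution limit described above can be illustrated with a minimal sketch. It assumes two noise-free, equal-amplitude complex exponentials with zero initial phase and a 16-sample record; the helper names (`periodogram`, `count_main_peaks`, `two_tones`) and all numerical values are illustrative choices, not taken from the report.

```python
import cmath
import math

def periodogram(x, grid):
    """Zero-padded periodogram |X(f)|^2 / N on `grid` frequencies in [0, 1)."""
    n_samples = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * q * n / grid)
                    for n in range(n_samples))) ** 2 / n_samples
            for q in range(grid)]

def count_main_peaks(spectrum):
    """Count local maxima that reach at least half of the global maximum."""
    half_max = 0.5 * max(spectrum)
    size = len(spectrum)
    return sum(1 for q in range(size)
               if spectrum[q] >= half_max
               and spectrum[q] > spectrum[q - 1]
               and spectrum[q] >= spectrum[(q + 1) % size])

def two_tones(f1, f2, n_samples):
    """Sum of two unit-amplitude complex exponentials (cycles/sample)."""
    return [cmath.exp(2j * math.pi * f1 * n) + cmath.exp(2j * math.pi * f2 * n)
            for n in range(n_samples)]

N = 16  # short data record; the DFT bin-width is 1/N cycles per sample
# Spacing of three bin-widths: two distinct main peaks are expected.
resolved = count_main_peaks(periodogram(two_tones(0.20, 0.20 + 3.0 / N, N), 512))
# Spacing of half a bin-width: the two main lobes merge into one peak.
unresolved = count_main_peaks(periodogram(two_tones(0.20, 0.20 + 0.5 / N, N), 512))
```

Zero-padding to 512 points only interpolates the spectrum more finely; in this sketch it does not separate the two tones spaced at half a bin-width.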

III.b : High-Resolution Methods

A multitude of DOA/Frequency Estimation algorithms, their variations and analyses are available in the literature [1-5, 7-20, 22-26, 28-30, 32-49]. In the following paragraphs only some of the major developments are briefly discussed.

Minimum Variance Method : This was perhaps the earliest high-resolution method specifically developed for frequency-wavenumber estimation. In order to improve upon the periodogram’s resolution limit, Capon proposed a Minimum Variance method, which is a linear estimator that minimizes the interference at frequencies outside the band of interest [4]. Its performance has been shown to be better than that of the periodogram estimator but worse than that of the modeling-based estimators [20].

Model-Based Methods : A major motivation for many modern high-resolution frequency estimation methods has come from the desire to achieve more exact models for the sinusoids-in-noise data. In the Parameter Estimation area of statistical time-series analysis, it had been well established that Auto-Regressive (AR) modeling is very appropriate for modeling data with peaky spectra. In the frequency estimation field, it had likewise been common knowledge that data composed of sinusoids in noise tend to have peaky spectra. Consequently, frequency estimation based on AR-modeling has received considerable attention [3, 9, 24, 29].

Depending on how the autocorrelation values are estimated, there are three types of AR parameter estimation methods, namely, the Autocorrelation method, the Covariance method and the Modified Covariance method (also known as the Forward-Backward method). The latter two are more appropriate for sinusoidal processes because of their implicit relationship with Prony’s method, which provides perfect frequency estimates when no noise is present. Incidentally, the Maximum Entropy method proposed by Burg [7] and the Linear Prediction based spectral estimator [9] both produce essentially the same frequency estimates as the Covariance method.

When $p$ sinusoids are present and a $p$-th order AR model is used, the frequency estimates are found to be poor at low SNR (< 30 dB). To circumvent this hurdle, a larger-order ($L > p$) AR model has been proposed [25, 60]. The larger model order tends to accommodate a major part of the interfering noise and thereby reduces the effect of noise on the estimates. Even so, the larger-order approach performs poorly below 20 dB SNR.

Eigen-Analysis of the Auto-Correlation Matrix of Sinusoid-in-Noise Data : AR modeling based approaches offer better resolution performance than their predecessors, but these as well as the earlier methods are basically general spectrum estimation methodologies applied to this specific narrow-band problem. Since the mid-to-late seventies, a whole new class of algorithms has been developed by effective exploitation of the special properties of the autocorrelation matrix of the sinusoids-in-noise data. For $N = p + 1$, the eigendecomposition of $\mathbf{C}$ was first utilized by Pisarenko [23], who showed that the z-polynomial formed with the elements of the eigenvector corresponding to the smallest eigenvalue has roots at the signal frequencies. Though the idea is elegant, Pisarenko’s method performs quite poorly for noisy signals. Pisarenko’s approach was later improved upon by Kumaresan [15] where, for $N > p$ cases, all the noise eigenvectors were utilized. As an alternate approach, it was shown in [15] that the signal subspace eigenvectors can also be utilized to form a noise subspace vector which should have zeros at the signal frequency locations. This was achieved in [15, 26] by formulating a Minimum-Norm criterion, which is the framework that will be used in the proposed work.

Another major improvement on Pisarenko’s approach was presented by Schmidt [30] and Bienvenu and Kopp [1]. They proposed to combine the eigenvectors corresponding to the $(L - p)$ smaller eigenvalues of $\mathbf{C}$ and used an orthogonality criterion to obtain the frequency estimates. In the literature, this approach is known as the ‘MUSIC’ method.

It  may  be  pertinent  to  emphasize  here  that  the  approach  proposed  in  this  work  for  extracting  signal  or  noise 
subspace  ‘without  eigendecomposition’  may  be  combined  with  either  the  MNM  or  the  MUSIC  framework.  The 
MNM  framework  has  been  preferred  in  the  development  in  Section  V  because  in  case  of  the  Minimum-Norm 
method,  the  frequencies  are  found  directly  from  the  polynomial  roots.  On  the  other  hand,  a  search  procedure 
is  necessary  in  case  of  MUSIC  for  estimating  the  frequencies.  The  polynomial  version  of  MUSIC,  known  as 
‘root-MUSIC’,  could  also  be  used  but  in  that  case  the  order  of  the  z-polynomial  would  be  twice  that  of  MNM. 

Maximum-Likelihood Method : This class of algorithms maximizes the likelihood function for the observed data, leading to optimization of a non-linear criterion which can only be performed iteratively. Several different approaches are available in the literature [16-18, 27-29, 33-35, 37, 48], among which the recently proposed Constrained MLE approach developed by the first author [37] appears to offer the most accurate results.

Other Methods and the Motivation for the Proposed Method : As listed in the references, there are a large number of other methods that address the high-resolution Frequency/DOA estimation problems. In order to achieve the desired high-resolution capability, all these algorithms utilize some form of eigen-analysis or non-linear optimization, both of which are computationally intensive for real-time applications. The primary objective of this paper is to study whether the computational simplicity of the DFT can be effectively combined with the underlying mathematical framework of some of the existing high-resolution methods. The final goal is to achieve high resolution without any iterative optimization such that real-time implementation may be feasible with existing hardware. The proposed method makes use of the special properties of correlation matrices, which are outlined next.

IV : Some Properties of the Autocorrelation Matrix

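Since the data described by (II.3) form an uncorrelated, zero-mean WSS process, the $N \times N$ ($N > p$) covariance matrix $\mathbf{C}$ will have the following matrix decomposition when there is no observation noise,

$$\mathbf{C} = \mathbf{T}\mathbf{S}\mathbf{T}^H \tag{IV.1}$$

where $\mathbf{S} \triangleq \mathrm{diag}(\sigma_1^2 \;\; \sigma_2^2 \;\; \ldots \;\; \sigma_p^2)$ and $\sigma_i^2$ denotes the power of the $i$-th signal. Note that this ideal $\mathbf{C}$ has rank $p$. In this case, the eigendecomposition of $\mathbf{C}$ can be written as,

$$\mathbf{C}\mathbf{V} = [\, \lambda_1\mathbf{v}_1 \;\; \cdots \;\; \lambda_p\mathbf{v}_p \;\; \mathbf{0} \;\; \cdots \;\; \mathbf{0} \,] \tag{IV.2}$$

For observations with noise, the theoretical covariance matrix becomes

$$\mathbf{C} = \mathbf{T}\mathbf{S}\mathbf{T}^H + \sigma^2\mathbf{I} \tag{IV.3}$$
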

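Note that this theoretical $\mathbf{C}$ has rank $N$, though the signal part, $\mathbf{T}\mathbf{S}\mathbf{T}^H$, has rank $p$. In this case, the eigendecomposition of $\mathbf{C}$ can be written as,

$$\mathbf{C}\mathbf{V} = [\, (\lambda_1 + \sigma^2)\mathbf{v}_1 \;\; \cdots \;\; (\lambda_p + \sigma^2)\mathbf{v}_p \;\; \sigma^2\mathbf{v}_{p+1} \;\; \cdots \;\; \sigma^2\mathbf{v}_N \,] \tag{IV.4}$$

where the $\lambda_i$’s and $\sigma^2$ represent the signal and noise eigenvalues. But in practice, the eigendecomposition has to be performed on the sample covariance matrix $\hat{\mathbf{C}}$ as defined in (II.13), and then the noise eigenvalues will not be equal but will also be absorbed into the signal eigenvalues. In that case,

$$\hat{\mathbf{C}}\hat{\mathbf{V}} = [\, \hat{\lambda}_1\hat{\mathbf{v}}_1 \;\; \cdots \;\; \hat{\lambda}_p\hat{\mathbf{v}}_p \;\; \hat{\lambda}_{p+1}\hat{\mathbf{v}}_{p+1} \;\; \cdots \;\; \hat{\lambda}_N\hat{\mathbf{v}}_N \,] \tag{IV.5}$$

where the estimated eigenvalues are ordered as $\hat{\lambda}_1 \geq \hat{\lambda}_2 \geq \cdots \geq \hat{\lambda}_N$. The eigenvectors corresponding to the $p$ largest eigenvalues are called the ‘signal eigenvectors’, which constitute the signal subspace. All the other eigenvectors are known as the ‘noise eigenvectors’. Note also that the $p$ ‘signal eigenvectors’ of $\mathbf{C}$ span the subspace defined by the columns of $\mathbf{T}$ and that they are orthogonal to the ‘noise subspace’ eigenvectors.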


V  :  The  Proposed  DFT-Based  Minimum-Norm  method  (D-MNM) 


As  a  significant  departure  from  the  eigen-based  approaches  discussed  in  the  previous  section,  this  work  advocates 
that  the  signal-subspace  information  be  extracted  from  the  DFT-of-AC  matrix  which  can  be  accomplished  with 
a  single  matrix  multiplication.  This  will  eliminate  the  need  for  iterative  calculation  of  eigenvectors  which  is 
computationally  intensive.  The  central  idea  behind  the  DFT-of-AC  matrix  is  analyzed  first. 

V.a  :  Signal  and  Noise  Subspace  Extraction  from  the  DFT-of-AC  Matrix 

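Let the DFT matrix be denoted as,

$$\mathbf{D} \triangleq [\, \mathbf{e}_1 \;\; \mathbf{e}_2 \;\; \cdots \;\; \mathbf{e}_N \,], \tag{V.1}$$

where the $l$-th element of the $k$-th DFT vector $\mathbf{e}_k$ is the complex exponential $e_k(l) = e^{j2\pi kl/N}$, up to a normalizing constant, for $k, l = 0, 1, 2, \ldots, N-1$. If the frequencies $\omega_i$ are all on the DFT bins and if there is no observation noise, then in general,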

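$$\mathbf{f}_k \triangleq \mathbf{C}\mathbf{e}_k \tag{V.2a}$$

$$= \frac{1}{M}\sum_{m=1}^{M} \mathbf{x}_m\mathbf{x}_m^H\mathbf{e}_k, \quad \text{using (II.13b)} \tag{V.2b}$$

$$= \frac{1}{M}\sum_{m=1}^{M} (\mathbf{a}_m^H\mathbf{T}^H\mathbf{e}_k)\,\mathbf{x}_m, \quad \text{using (II.8)} \tag{V.2c}$$

$$= \frac{1}{M}\sum_{m=1}^{M} \mathbf{a}_m^H \begin{bmatrix} \mathbf{t}_1^H\mathbf{e}_k \\ \vdots \\ \mathbf{t}_p^H\mathbf{e}_k \end{bmatrix} \mathbf{x}_m. \tag{V.2d}$$

If the $k$-th DFT vector corresponds to one of the $\omega_i$ frequencies,

$$\mathbf{f}_k = \frac{1}{M}\sum_{m=1}^{M} A_{km}^*\,\mathbf{T}\mathbf{a}_m = \mathbf{T}\,\frac{1}{M}\sum_{m=1}^{M} A_{km}^*\,\mathbf{a}_m = \mathbf{T} \begin{bmatrix} \frac{1}{M}\sum_{m=1}^{M} A_{km}^* A_{1m} \\ \vdots \\ \frac{1}{M}\sum_{m=1}^{M} A_{km}^* A_{pm} \end{bmatrix} = \mathbf{T} \begin{bmatrix} \hat{\sigma}_{k1} \\ \vdots \\ \hat{\sigma}_{kp} \end{bmatrix} \tag{V.3}$$

where the $\hat{\sigma}_{ki}$’s denote the covariances of the complex amplitudes. Assuming the number of samples $M$ to be large, and since the $A_{im}$ are independent random variables, $\frac{1}{M}\sum_{m=1}^{M} A_{km}^* A_{im} \triangleq \hat{\sigma}_{ki} = \sigma_k^2\,\delta_{ki}$. Hence,

$$\mathbf{f}_k = \hat{\sigma}_{kk}\,\mathbf{t}_k = \sigma_k^2\,\mathbf{t}_k. \tag{V.4}$$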

Note that the norm of $\mathbf{f}_k$ is directly proportional to the signal power, i.e., this norm will be large if the signal power is significant. On the other hand, if a DFT vector $\mathbf{e}_k$ does not correspond to any of the $\omega_i$ frequencies then, due to orthogonality, $\mathbf{t}_i^H\mathbf{e}_k = 0, \; \forall i$. For such cases,

$$\mathbf{f}_k = \mathbf{0}. \tag{V.5}$$

For this ideal case then, the DFT-of-AC has the following decomposition,

$$\mathbf{F} \triangleq \mathbf{C}\mathbf{D} \tag{V.6a}$$

$$\triangleq [\, \mathbf{f}_1 \;\; \mathbf{f}_2 \;\; \cdots \;\; \mathbf{f}_N \,] \tag{V.6b}$$

$$= [\, \Lambda_1\mathbf{u}_1 \;\; \cdots \;\; \Lambda_p\mathbf{u}_p \;\; \mathbf{0} \;\; \cdots \;\; \mathbf{0} \,] \tag{V.6c}$$

where the $\Lambda_i$’s and $\mathbf{u}_i$’s are the lengths and unit vectors of each $\mathbf{f}_i$, respectively. Note that the unit vectors in the matrix in (V.6c) have been rearranged so that the zero/nonzero components are clustered together. Interestingly, this decomposition appears to be very similar to the usual Eigendecomposition of the noiseless and ideal $\mathbf{C}$, as given by (IV.2). For this ideal signal scenario again, if the DFT-of-AC is formed using the theoretical and noisy Covariance matrix of (IV.3), then the decomposition has the form,

$$\mathbf{F} = \mathbf{C}\mathbf{D} \tag{V.7a}$$

$$= \mathbf{T}\mathbf{S}\mathbf{T}^H\mathbf{D} + \sigma^2\mathbf{D} \tag{V.7b}$$

$$= [\, (\Lambda_1 + \sigma^2)\mathbf{u}_1 \;\; \cdots \;\; (\Lambda_p + \sigma^2)\mathbf{u}_p \;\; \sigma^2\mathbf{u}_{p+1} \;\; \cdots \;\; \sigma^2\mathbf{u}_N \,], \tag{V.7c}$$

where the $\mathbf{u}_i$’s have been arranged in decreasing order of lengths. Note again that this decomposition is analogous to the one in (IV.4). In this case also, the $p$ largest-norm vectors of the DFT-of-AC matrix contain the signal subspace information.

In practice, the $\omega_i$’s will not be on the DFT bins and the observations may also be noisy and hence, the decomposition in (V.6) or (V.7) will not hold. But the DFT components ($\mathbf{f}_k$’s) closer to the signal frequencies will tend to have larger norms. Hence, for the general scenario, when the observation data are noisy and the angular frequencies $\omega_i$ are arbitrarily spaced, the signal/noise subspace decomposition can be formed as :


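$$\mathbf{F} \rightarrow [\, \Lambda_1\mathbf{u}_1 \;\; \cdots \;\; \Lambda_p\mathbf{u}_p \;|\; \Lambda_{p+1}\mathbf{u}_{p+1} \;\; \cdots \;\; \Lambda_N\mathbf{u}_N \,] \tag{V.8a}$$

$$\triangleq [\, \mathbf{U}_S \;|\; \mathbf{U}_N \,]\,\boldsymbol{\Lambda} \tag{V.8b}$$

$$\triangleq [\, \mathbf{F}_S \;|\; \mathbf{F}_N \,] \tag{V.8c}$$

$$\triangleq [\, \mathbf{f}_1 \;\; \mathbf{f}_2 \;\; \ldots \;\; \mathbf{f}_p \;|\; \mathbf{f}_{p+1} \;\; \ldots \;\; \mathbf{f}_N \,] \tag{V.8d}$$

where the matrix $\mathbf{F}_S$ is formed with the $p$ vectors $\mathbf{f}_i$ having the largest norms, $\Lambda_1 \geq \Lambda_2 \geq \cdots \geq \Lambda_N$ are the norms of the corresponding $\mathbf{f}_i$ vectors, and the matrices $\boldsymbol{\Lambda}$, $\mathbf{U}_S$ and $\mathbf{U}_N$ are formed as

$$\boldsymbol{\Lambda} \triangleq \mathrm{diag}(\Lambda_1, \ldots, \Lambda_N), \qquad \mathbf{U}_S \triangleq [\, \mathbf{u}_1 \;\; \mathbf{u}_2 \;\; \cdots \;\; \mathbf{u}_p \,] \qquad \text{and} \qquad \mathbf{U}_N \triangleq [\, \mathbf{u}_{p+1} \;\; \cdots \;\; \mathbf{u}_N \,].$$
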

It may be observed again that the decomposition in (V.8) is analogous to the eigen-based counterpart in (IV.5). It may also be noted here that in the ideal signal cases of (V.6) and (V.7), a unit vector $\mathbf{u}_i$ corresponds to one of the DFT vectors $\mathbf{e}_k$, but in the general case of (V.8), they are linear combinations of the DFT components close to the signal frequencies.
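
The ideal on-bin case of (V.7) can be checked numerically with a minimal sketch. It assumes the theoretical covariance $\mathbf{T}\mathbf{S}\mathbf{T}^H + \sigma^2\mathbf{I}$, steering vectors $t_i(n) = e^{j\omega_i n}$ and unit-norm DFT vectors (the normalization is an assumption made here for convenience). The function name `dft_of_ac_norms` and the numerical values are hypothetical.

```python
import cmath
import math

def dft_of_ac_norms(n_sensors, bins, powers, noise_var):
    """Column norms of F = C D for sources lying exactly on DFT bins."""
    N = n_sensors
    # Steering vectors t_i(n) = exp(j 2 pi k_i n / N), one per source.
    T = [[cmath.exp(2j * math.pi * k * n / N) for n in range(N)] for k in bins]
    # Theoretical covariance C = T S T^H + noise_var * I.
    C = [[sum(s * t[r] * t[c].conjugate() for s, t in zip(powers, T))
          + (noise_var if r == c else 0.0)
          for c in range(N)] for r in range(N)]
    norms = []
    for k in range(N):
        # Unit-norm DFT vector e_k; f_k = C e_k is the k-th column of F.
        e = [cmath.exp(2j * math.pi * k * l / N) / math.sqrt(N) for l in range(N)]
        f = [sum(C[r][l] * e[l] for l in range(N)) for r in range(N)]
        norms.append(math.sqrt(sum(abs(v) ** 2 for v in f)))
    return norms

norms = dft_of_ac_norms(n_sensors=8, bins=[1, 3], powers=[2.0, 1.0], noise_var=0.1)
# Indices of the p = 2 largest-norm columns, i.e. the candidate signal bins.
signal_bins = sorted(range(8), key=lambda k: norms[k], reverse=True)[:2]
```

With this normalization the columns at the two source bins have norms $N\sigma_k^2 + \sigma^2$, while every other column has norm $\sigma^2$, so the largest-norm columns identify the signal bins.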

V.b  :  Incorporation  of  DFT-Based  Signal  Subspace  in  the  Minimum-Norm  Framework 


The principal idea behind the Minimum-Norm method is to form an appropriate ‘noise-subspace’ vector $\mathbf{d}$ which is orthogonal to the ‘signal-subspace’ defined by $\mathbf{F}_S$. Let,

$$D(z) \triangleq \sum_{k=0}^{N-1} d_k z^{-k} \tag{V.9}$$

be an $(N-1)$-th order z-polynomial with $p$ zeros at $z_i = e^{j\omega_i}$, for $i = 1, \ldots, p$, corresponding to the DOAs. The coefficient vector is denoted as,


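$$\mathbf{d} \triangleq [\, d_0 \;\; d_1 \;\; \cdots \;\; d_{N-1} \,]^T \triangleq \begin{bmatrix} 1 \\ \mathbf{d}' \end{bmatrix} \tag{V.10}$$

where $d_0 = 1$ and $\mathbf{d}'$ contains the unknown coefficients. According to the MNM philosophy [15], if $\mathbf{F}_S$ does constitute the signal subspace, then $\mathbf{d}$ must be orthogonal to $\mathbf{F}_S$, i.e.,

$$\mathbf{F}_S^H\mathbf{d} = \mathbf{0}. \tag{V.11}$$
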

The vector $\mathbf{d}$ needs to be found by solving this underdetermined set of equations, which has an infinite number of solutions. According to [15, 26], the solution that also minimizes the norm $\|\mathbf{d}\|^2$ possesses the desirable property that all its extraneous (non-signal) roots fall inside the unit circle. This ‘minimum-norm’ solution of (V.11) can be expressed as :

$$\mathbf{d} = \begin{bmatrix} 1 \\ -\mathbf{G}^H(\mathbf{G}\mathbf{G}^H)^{-1}\mathbf{g} \end{bmatrix} \tag{V.12a}$$

where $\mathbf{F}_S^H$ is partitioned as,

$$\mathbf{F}_S^H \triangleq [\, \mathbf{g} \;|\; \mathbf{G} \,]. \tag{V.12b}$$

Once $\mathbf{d}$ is estimated, the $p$ roots of $D(z)$ closest to the unit circle are used to find the DOAs. It may be recalled that in E-MNM the signal-subspace eigenvectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p$, as defined in (IV.5), are used to form $\mathbf{F}_S$ [15, 26]. But in the case of the proposed approach, no eigendecomposition is necessary. Post-multiplication of $\mathbf{C}$ by the DFT matrix $\mathbf{D}$ is all that is required to extract the signal subspace in (V.8).

V.c  :  Summary  of  the  Proposed  D-MNM  Algorithm 


The  key  steps  and  some  alternative  possibilities  are  summarized  in  this  Section. 


V.c.1 : Algorithm Steps

1. Form the Covariance Matrix estimate using the forward-backward method [15] :

$$\hat{\mathbf{C}} = \frac{1}{2M}\sum_{m=1}^{M}\left(\mathbf{x}_m\mathbf{x}_m^H + \mathbf{x}_m^b\mathbf{x}_m^{bH}\right)$$

The ‘backward’ vector is defined as $\mathbf{x}_m^b \triangleq \mathbf{J}\mathbf{x}_m^*$, where $\mathbf{J}$ denotes the permutation matrix with 1’s at the cross-diagonal entries and $*$ denotes the complex-conjugate operation.

2. Post-multiply $\hat{\mathbf{C}}$ by the DFT matrix $\mathbf{D}$ to form the DFT-of-AC matrix, $\mathbf{F} \triangleq \hat{\mathbf{C}}\mathbf{D}$.

3. Form $\mathbf{F}_S$ as in (V.8c) using the $p$ unit vectors corresponding to the largest norms. Partition $\mathbf{F}_S^H$ as in (V.12b).

4. Estimate the $\mathbf{d}$ vector using (V.12a) and form the $D(z)$ polynomial using the elements of $\mathbf{d}$.

5. Find the roots of $D(z)$. Pick the $p$ roots closest to the unit circle to find the desired frequencies/DOAs.
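
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a forward-only covariance estimate in Step 1, unnormalized DFT vectors in Step 2, and replaces the polynomial rooting of Step 5 with a grid search for the deepest minima of $|D(e^{j\omega})|^2$. The function name `dmnm_frequencies`, the noise-free test data and all numerical values are hypothetical.

```python
import cmath
import math
import random

def dmnm_frequencies(snapshots, p, grid=1024):
    """Sketch of D-MNM; returns p frequency estimates in cycles/sample."""
    N, M = len(snapshots[0]), len(snapshots)
    # Step 1 (simplified): forward-only sample covariance C = (1/M) sum x x^H.
    C = [[sum(x[r] * x[c].conjugate() for x in snapshots) / M
          for c in range(N)] for r in range(N)]
    # Step 2: columns of F = C D, with DFT vectors e_k(l) = exp(j 2 pi k l / N).
    F = []
    for k in range(N):
        e = [cmath.exp(2j * math.pi * k * l / N) for l in range(N)]
        F.append([sum(C[r][l] * e[l] for l in range(N)) for r in range(N)])
    # Step 3: keep the p columns with the largest norms as F_S.
    F.sort(key=lambda f: sum(abs(v) ** 2 for v in f), reverse=True)
    Fs = F[:p]
    # Partition F_S^H = [g | G]: g is p x 1, G is p x (N-1).
    g = [f[0].conjugate() for f in Fs]
    G = [[v.conjugate() for v in f[1:]] for f in Fs]
    # Step 4: d' = -G^H (G G^H)^{-1} g; solve (G G^H) y = g by elimination.
    A = [[sum(G[i][n] * G[j][n].conjugate() for n in range(N - 1))
          for j in range(p)] + [g[i]] for i in range(p)]
    for i in range(p):
        piv = max(range(i, p), key=lambda r: abs(A[r][i]))
        A[i], A[piv] = A[piv], A[i]
        for r in range(p):
            if r != i:
                m = A[r][i] / A[i][i]
                A[r] = [a - m * b for a, b in zip(A[r], A[i])]
    y = [A[i][p] / A[i][i] for i in range(p)]
    d = [1.0] + [-sum(G[i][n].conjugate() * y[i] for i in range(p))
                 for n in range(N - 1)]
    # Step 5 (search variant): deepest local minima of |D(e^{jw})|^2 on a grid.
    mag = [abs(sum(d[n] * cmath.exp(-2j * math.pi * q * n / grid)
                   for n in range(N))) ** 2 for q in range(grid)]
    minima = [q for q in range(grid)
              if mag[q] < mag[q - 1] and mag[q] <= mag[(q + 1) % grid]]
    best = sorted(minima, key=lambda q: mag[q])[:p]
    return sorted(q / grid for q in best)

# Noise-free test: two sources half a DFT bin-width apart (N = 8 sensors).
random.seed(0)
N, M, true_f = 8, 40, [200 / 1024, 264 / 1024]
snaps = []
for _ in range(M):
    ph = [random.uniform(-math.pi, math.pi) for _ in true_f]
    snaps.append([sum(cmath.exp(1j * (2 * math.pi * f * n + a))
                      for f, a in zip(true_f, ph)) for n in range(N)])
est = dmnm_frequencies(snaps, p=2)
```

In this noise-free setting every column of $\mathbf{F}$ lies in the span of the steering vectors, so the two largest-norm columns span the signal subspace and $D(z)$ has zeros at both source frequencies even though they are closer than one bin-width. With noisy data the estimates would only be approximate.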


V.c.2 : Alternate Possibilities

Steps 2 and 3 : Post-multiplication of the AC-matrix by a DFT-matrix has been used here because the decompositions described in Section V appear analogous to eigendecomposition. But it is easy to show that identical results can be obtained if the AC-matrix is pre-multiplied by a DFT matrix, i.e., the DFT-of-AC matrix can also be formed alternately as $\mathbf{F}_1 \triangleq \mathbf{D}\mathbf{C}$. In that case, the largest-norm row vectors of the DFT-of-AC matrix $\mathbf{F}_1$ must be used to form $\mathbf{F}_S$ defined in (V.13).

Step 4 : This step requires inversion of a matrix of dimension $(N-1) \times (N-1)$. This can be avoided by orthogonalizing the $p$ largest-norm vectors in $\mathbf{F}_S$. Let $\mathbf{F}_S^{o}$ be the new ‘signal-subspace’ matrix with the orthonormal set of vectors, which can be written in partitioned form as,

$$\mathbf{F}_S^{oH} \triangleq [\, \mathbf{g}_o \;|\; \mathbf{G}_o \,]. \tag{V.14}$$

With these partitioned matrices, $\mathbf{d}$ can again be found in Step 4 as [15],

$$\mathbf{d} = \begin{bmatrix} 1 \\ -\mathbf{G}_o^H\mathbf{g}_o / (1 - \mathbf{g}_o^H\mathbf{g}_o) \end{bmatrix} \tag{V.15}$$

It may be mentioned here that in [15], $p$ orthonormal signal eigenvectors were used to form $\mathbf{F}_S$, whereas here $\mathbf{F}_S$ is formed by orthogonalizing the $p$ largest-norm vectors of the DFT-of-AC matrix.

Step 5 : This step requires rooting of the $(N-1)$-th order polynomial $D(z)$. Instead, the frequencies may also be found from the peaks of the minimum-norm pseudo-spectrum [15, 26, 40].


VI :  Perturbation  Analysis  of  the  D-MNM  Algorithm 

According to the MNM framework, the angles of arrival $\theta_i$ are extracted from the $p$ roots of the polynomial $D(z)$ that are on or closest to the unit circle. Let the $p$ signal zeros of $D(z)$ on or closest to the unit circle be denoted as $z_i = e^{j\omega_i}$, which are found by rooting $D(z)$.

In  practice,  the  polynomial  D{z)y  defined  in  (V.9),  is  formed  with  the  coefficient  vector  d  estimated  using 
(V.12).  Since  d  is  a  function  of  observations,  any  error  in  d  would  be  due  to  deviations  or  noise  in  the  observa¬ 
tions.  Error  in  estimated  d  would  affect  the  estimated  roots,  z*  and  that  in  turn  would  introduce  errors  in  the 
corresponding  ct^j’s  as  well  as  in  the  0j’s.  Hence,  in  order  to  analyze  the  bias  and  MSE  of  the  AO  As,  we  need  to 
relate  these  errors  all  the  way  back  to  the  error  in  the  MNM  coefficient  vector  d.  Hence,  we  begin  by  relating  the 
AO  A  errors  to  signal  zero  errors  which  are  then  related  to  the  coefficient  errors. 

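VI.1 : Relationships Between the Errors in the AOA Estimates and the Signal Zeros

From (II.5) and the definition of D(z) in (V.9) we know,

$$z_i = e^{j\frac{2\pi d}{\lambda}\sin\theta_i} \qquad (VI.1a)$$

and,

$$z_i^* = e^{-j\frac{2\pi d}{\lambda}\sin\theta_i}. \qquad (VI.1b)$$

Hence, we can write,

$$\frac{dz_i}{d\theta_i} = j\frac{2\pi d}{\lambda}\cos\theta_i\, e^{j\frac{2\pi d}{\lambda}\sin\theta_i} \approx \frac{\Delta z_i}{\Delta\theta_i} \qquad (VI.2)$$

and,

$$\Delta z_i = j\frac{2\pi d}{\lambda}\cos\theta_i\, e^{j\frac{2\pi d}{\lambda}\sin\theta_i}\,\Delta\theta_i. \qquad (VI.3a)$$

Similarly for $z_i^*$,

$$\Delta z_i^* = -j\frac{2\pi d}{\lambda}\cos\theta_i\, e^{-j\frac{2\pi d}{\lambda}\sin\theta_i}\,\Delta\theta_i. \qquad (VI.3b)$$

Hence, the bias errors in AOAs have the following linear relationship with the corresponding signal root errors,

$$E(\Delta\theta_i) = -j\,\frac{\lambda\, z_i^*}{2\pi d\cos\theta_i}\,E(\Delta z_i) \qquad (VI.4)$$

where 'E( )' is used to denote the expectation operator. Furthermore, using (VI.3) the MSE of the AOAs can be shown to be related to the signal-root MSEs as,

$$E(|\Delta\theta_i|^2) = \left(\frac{\lambda}{2\pi d\cos\theta_i}\right)^2 E(|\Delta z_i|^2). \qquad (VI.5)$$

Equations (VI.4) and (VI.5) show that the bias and mean squared error in the AOA estimate are linearly related to those in the signal zeros. Next, the bias and MSE of signal zeros are related to the coefficient errors.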
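The first-order link between a root error and the corresponding angle error is easy to verify numerically. The sketch below maps a small root perturbation back to an angle perturbation using the linearization of $z_i = e^{j(2\pi d/\lambda)\sin\theta_i}$. The half-wavelength spacing and the function names are assumptions made for the illustration.

```python
import numpy as np

def root_from_angle(theta, d_over_lambda=0.5):
    """Signal zero z = exp(j*2*pi*(d/lambda)*sin(theta)); theta in radians."""
    return np.exp(2j * np.pi * d_over_lambda * np.sin(theta))

def angle_error_from_root_error(delta_z, theta, d_over_lambda=0.5):
    """First-order angle error implied by a root error delta_z at nominal angle theta."""
    z = root_from_angle(theta, d_over_lambda)
    scale = 2.0 * np.pi * d_over_lambda * np.cos(theta)
    # Invert delta_z = j*scale*z*delta_theta; the real part drops second-order residue.
    return (-1j * np.conj(z) * delta_z / scale).real

theta = np.deg2rad(18.0)
delta_theta = 1e-5                                   # small true angle error (radians)
delta_z = root_from_angle(theta + delta_theta) - root_from_angle(theta)
print(angle_error_from_root_error(delta_z, theta), delta_theta)
```

The scale factor $2\pi d\cos\theta_i/\lambda$ shrinks as the source moves toward endfire, so the same root error maps to a larger angle error there.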


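VI.2 : Effect of Coefficient Error on the Bias and Mean-Squared Error of the Zeros

The effect of errors in the coefficients $d_k$ on the zeros of D(z) is well known [11] and is given by (note that $d_0 = 1$),

$$\frac{\partial z_i}{\partial d_k} = \frac{-z_i^{-(k-1)}}{\prod_{l=1,\,l\neq i}^{N-1}\left(1 - z_l z_i^{-1}\right)}, \qquad k = 1, \ldots, N-1. \qquad (VI.6)$$

Using (VI.6), the total error in $z_i$ due to errors in all coefficients is given by,

$$\Delta z_i = \sum_{k=1}^{N-1}\frac{\partial z_i}{\partial d_k}\,\Delta d_k = \frac{-1}{\prod_{l=1,\,l\neq i}^{N-1}\left(1 - z_l z_i^{-1}\right)}\left[1 \;\; z_i^{-1} \;\cdots\; z_i^{-(N-2)}\right]\Delta\mathbf{d}' \qquad (VI.7a)$$

$$\phantom{\Delta z_i} = \frac{-\sqrt{N-1}}{\prod_{l=1,\,l\neq i}^{N-1}\left(1 - z_l z_i^{-1}\right)}\,\mathbf{V}^H(e^{j\omega_i})\,\Delta\mathbf{d}' \qquad (VI.7b)$$

where $\mathbf{d}'$ is defined in (V.10) and

$$\mathbf{V}(e^{j\omega_i}) \triangleq \mathbf{V}(z_i) \triangleq \frac{1}{\sqrt{N-1}}\left[1 \;\; e^{j\omega_i} \;\cdots\; e^{j(N-2)\omega_i}\right]^T.$$

Hence, the bias error of the estimated signal root is given by,

$$E(\Delta z_i) = \frac{-\sqrt{N-1}}{\prod_{l=1,\,l\neq i}^{N-1}\left(1 - z_l z_i^{-1}\right)}\,\mathbf{V}^H(e^{j\omega_i})\,E(\Delta\mathbf{d}'). \qquad (VI.8)$$

The mean squared error is given by,

$$E(|\Delta z_i|^2) = \frac{N-1}{\prod_{l=1,\,l\neq i}^{N-1}\left|1 - z_l z_i^{-1}\right|^2}\,\mathbf{V}^H(e^{j\omega_i})\,E(\Delta\mathbf{d}'\Delta\mathbf{d}'^H)\,\mathbf{V}(e^{j\omega_i}) \qquad (VI.9a)$$

$$\phantom{E(|\Delta z_i|^2)} \triangleq S\,\mathbf{V}^H(e^{j\omega_i})\,E(\Delta\mathbf{d}'\Delta\mathbf{d}'^H)\,\mathbf{V}(e^{j\omega_i}) \qquad (VI.9b)$$

where,

$$S \triangleq \frac{N-1}{\prod_{l=1,\,l\neq i}^{N-1}\left|1 - z_l z_i^{-1}\right|^2} \qquad (VI.9c)$$

denotes the sensitivity of the parameter set and is a measure of the effect of errors in the parameter vector d on the signal zeros. Equation (VI.8) shows that the bias in the signal zeros is linearly related to the bias in the coefficient vector, and equation (VI.9) provides the relationship between the MSE of the signal roots and the coefficient error covariance matrix. The expressions for $E(\Delta\mathbf{d}')$ and $E(\Delta\mathbf{d}'\Delta\mathbf{d}'^H)$ are derived next.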
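The root-sensitivity relation in (VI.6) and (VI.7) can be checked against a direct re-rooting of a perturbed polynomial. The sketch below is only a numerical sanity check under stated assumptions: D(z) is taken as a polynomial in $z^{-1}$ with $d_0 = 1$, the example zeros are arbitrary, and `predicted_root_shift` is a hypothetical helper name.

```python
import numpy as np

def predicted_root_shift(roots, i, delta_d):
    """First-order shift of zero i for coefficient errors delta_d = [dd_1, ..., dd_{N-1}]."""
    zi = roots[i]
    others = np.delete(roots, i)
    denom = np.prod(1.0 - others / zi)               # prod over l != i of (1 - z_l / z_i)
    k = np.arange(1, len(delta_d) + 1)
    return -np.sum(zi ** (-(k - 1.0)) * delta_d) / denom

# Arbitrary example: two zeros on the unit circle, two inside it.
roots = np.array([np.exp(0.5j), np.exp(1.5j), 0.6 * np.exp(-1.0j), 0.5 * np.exp(2.5j)])
d = np.poly(roots)                                   # [d_0 = 1, d_1, ..., d_4]
rng = np.random.default_rng(0)
delta_d = 1e-7 * (rng.standard_normal(4) + 1j * rng.standard_normal(4))
perturbed = np.roots(d + np.concatenate(([0.0], delta_d)))   # d_0 stays equal to 1

for i, zi in enumerate(roots):
    actual = perturbed[np.argmin(np.abs(perturbed - zi))] - zi
    print(i, abs(actual - predicted_root_shift(roots, i, delta_d)))
```

The denominator shows why closely spaced zeros are more sensitive: each factor $1 - z_l z_i^{-1}$ shrinks as $z_l$ approaches $z_i$.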


VI.3  :  Coefficient  Error  due  to  the  D-MNM  Method 

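The analysis in this part relies on the assumption made at the outset: the observation data consist of p complex sinusoids in additive white Gaussian noise $z_m(n)$. Rewriting explicitly the observation matrix defined in (II.12),

$$\mathbf{X} \triangleq [\mathbf{x}_1 \;\; \mathbf{x}_2 \;\cdots\; \mathbf{x}_M] = \begin{pmatrix} x_1(0) & x_2(0) & \cdots & x_M(0) \\ x_1(1) & x_2(1) & \cdots & x_M(1) \\ \vdots & \vdots & & \vdots \\ x_1(N-1) & x_2(N-1) & \cdots & x_M(N-1) \end{pmatrix} \qquad (VI.10a)$$

where,

$$x_m(n) \triangleq \sum_{i=1}^{p} a_{im}\, e^{j\omega_i n} + z_m(n). \qquad (VI.10b)$$

Note that the phase of the complex amplitude $a_{im}$ in (VI.10b) is assumed to be uniformly distributed. In addition, the theoretical autocorrelation matrix C in (IV.3) can be written explicitly as,

$$\mathbf{C} \triangleq E(\mathbf{x}\mathbf{x}^H) = \mathbf{T}\mathbf{S}\mathbf{T}^H + \sigma^2\mathbf{I}$$

$$= \begin{pmatrix} E(x(0)x^*(0)) & E(x(0)x^*(1)) & \cdots & E(x(0)x^*(N-1)) \\ E(x(1)x^*(0)) & E(x(1)x^*(1)) & \cdots & E(x(1)x^*(N-1)) \\ \vdots & \vdots & & \vdots \\ E(x(N-1)x^*(0)) & E(x(N-1)x^*(1)) & \cdots & E(x(N-1)x^*(N-1)) \end{pmatrix}$$

$$= \begin{pmatrix} C(0) & C(-1) & C(-2) & \cdots & C(-(N-1)) \\ C(1) & C(0) & C(-1) & \cdots & C(-(N-2)) \\ C(2) & C(1) & C(0) & \cdots & C(-(N-3)) \\ \vdots & \vdots & \vdots & & \vdots \\ C(N-1) & C(N-2) & C(N-3) & \cdots & C(0) \end{pmatrix} \qquad (VI.11)$$

where,

$$C(m) = \begin{cases} \sum_{i=1}^{p} |a_i|^2 + \sigma^2, & \text{if } m = 0; \\ \sum_{i=1}^{p} |a_i|^2 e^{j\omega_i m}, & \text{if } m \neq 0. \end{cases} \qquad (VI.12)$$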

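From Section V.a we know that the D-MNM method uses the following decomposition,

$$\mathbf{F} \triangleq \mathbf{C}\mathbf{D} \qquad (VI.13a)$$

$$= [\,\mathbf{F}_S \mid \mathbf{F}_N\,] \qquad (VI.13b)$$

$$\triangleq [\,\mathbf{f}_1 \;\; \mathbf{f}_2 \;\cdots\; \mathbf{f}_p \mid \mathbf{f}_{p+1} \;\cdots\; \mathbf{f}_N\,] \qquad (VI.13c)$$

$$= [\,\mathbf{C}\mathbf{e}_1 \;\; \mathbf{C}\mathbf{e}_2 \;\cdots\; \mathbf{C}\mathbf{e}_p \mid \mathbf{C}\mathbf{e}_{p+1} \;\cdots\; \mathbf{C}\mathbf{e}_N\,] \qquad (VI.13d)$$

$$= [\,\mathbf{C}\mathbf{D}_S \mid \mathbf{C}\mathbf{D}_N\,] \qquad (VI.13e)$$

where p is the assumed number of signal sources. $\mathbf{D}_S$ contains the signal-subspace DFT-vectors $\mathbf{e}_i$ which correspond to the largest-norm $\mathbf{f}_i$ vectors. In D-MNM, the vector d is entirely in the noise subspace and must be orthogonal to the signal subspace $\mathbf{F}_S$, i.e., repeating from (V.11),

$$\mathbf{F}_S^H\mathbf{d} = \mathbf{0}, \qquad (VI.14a)$$

or, using (VI.13),

$$\mathbf{D}_S^H\mathbf{C}^H\mathbf{d} = \mathbf{0}. \qquad (VI.14b)$$

Using (V.10) and (V.12b),

$$\mathbf{G}\mathbf{d}' = -\mathbf{g} \qquad (VI.15)$$

and the error expression can be written as,

$$\Delta(\mathbf{G}\mathbf{d}') = -\Delta\mathbf{g}. \qquad (VI.16)$$

Using the chain rule,

$$\Delta\mathbf{G}\,\mathbf{d}' + \mathbf{G}\,\Delta\mathbf{d}' = -\Delta\mathbf{g}. \qquad (VI.17)$$

The pseudo-inverse solution for the coefficient error is,

$$\Delta\mathbf{d}' = -\mathbf{G}^{\#}(\Delta\mathbf{G}\,\mathbf{d}' + \Delta\mathbf{g}) = -\mathbf{G}^{\#}[\,\Delta\mathbf{g} \;\vdots\; \Delta\mathbf{G}\,]\,\mathbf{d} = -\mathbf{G}^{\#}\,\Delta\mathbf{F}_S^H\,\mathbf{d} \qquad (VI.18)$$

where $\mathbf{G}^{\#}$ denotes the Moore-Penrose pseudo-inverse of G. Equation (VI.18) shows that the error in the coefficient vector depends directly on the error of the signal subspace obtained by using the D-MNM method. The coefficient error in (VI.18) can now be utilized in the expressions for bias and mean squared error of the zeros which were derived in (VI.8) and (VI.9), respectively. However, according to (VI.8) and (VI.9), in order to obtain $E(\Delta z_i)$ and $E(|\Delta z_i|^2)$ we need expressions for $E(\Delta\mathbf{d}')$ and $E(\Delta\mathbf{d}'\Delta\mathbf{d}'^H)$, which are derived in the next two sections.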


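VI.4 : Bias in the Signal Zeros

First let us find the mean of the coefficient error,

$$\begin{aligned} E(\Delta\mathbf{d}') &= -\mathbf{G}^{\#}E(\Delta\mathbf{F}_S^H)\,\mathbf{d} \\ &= -\mathbf{G}^{\#}E(\mathbf{D}_S^H\Delta\mathbf{C}^H)\,\mathbf{d}; && \text{using (VI.13)} \\ &= -\mathbf{G}^{\#}E(\mathbf{D}_S^H\hat{\mathbf{C}}^H)\,\mathbf{d} + \mathbf{G}^{\#}\mathbf{D}_S^H\mathbf{C}^H\mathbf{d}; && \text{assuming } \Delta\mathbf{C} = \hat{\mathbf{C}} - \mathbf{C} \\ &= -\mathbf{G}^{\#}\mathbf{D}_S^H E(\hat{\mathbf{C}}^H)\,\mathbf{d}; && \text{using (VI.14)} \\ &= -\mathbf{G}^{\#}\mathbf{D}_S^H\mathbf{C}^H\mathbf{d}; && \text{if the covariance estimate is unbiased} \\ &= \mathbf{0}; && \text{using (VI.14).} \end{aligned} \qquad (VI.19)$$

Substituting (VI.19) in (VI.8) results in,

$$E(\Delta z_i) \approx 0. \qquad (VI.20)$$

This expression implies that the bias in the estimate obtained by using the D-MNM method can be expected to be quite small. This was also observed in simulations.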

VI.5  :  Mean  Squared  Error  in  the  Signal  Zeros 
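For obtaining an estimate of the mean squared error in the zeros, an expression is needed for $E(\Delta\mathbf{d}'\Delta\mathbf{d}'^H)$, which appears in (VI.9). Starting with (VI.18), we have,

$$E(\Delta\mathbf{d}'\Delta\mathbf{d}'^H) = \mathbf{G}^{\#}E(\Delta\mathbf{F}_S^H\mathbf{d}\mathbf{d}^H\Delta\mathbf{F}_S)(\mathbf{G}^{\#})^H = \mathbf{G}^{\#}\mathbf{D}_S^H E(\Delta\mathbf{C}^H\mathbf{d}\mathbf{d}^H\Delta\mathbf{C})\,\mathbf{D}_S(\mathbf{G}^{\#})^H \qquad (VI.21)$$

where,

$$\begin{aligned} E(\Delta\mathbf{C}^H\mathbf{d}\mathbf{d}^H\Delta\mathbf{C}) &= E\big((\hat{\mathbf{C}}^H - \mathbf{C}^H)\,\mathbf{d}\mathbf{d}^H(\hat{\mathbf{C}} - \mathbf{C})\big) \\ &= E(\hat{\mathbf{C}}^H\mathbf{d}\mathbf{d}^H\hat{\mathbf{C}}) - E(\hat{\mathbf{C}}^H\mathbf{d}\mathbf{d}^H\mathbf{C}) - E(\mathbf{C}^H\mathbf{d}\mathbf{d}^H\hat{\mathbf{C}}) + E(\mathbf{C}^H\mathbf{d}\mathbf{d}^H\mathbf{C}) \\ &= E(\hat{\mathbf{C}}^H\mathbf{d}\mathbf{d}^H\hat{\mathbf{C}}) - \mathbf{C}^H\mathbf{d}\mathbf{d}^H\mathbf{C} \end{aligned} \qquad (VI.22)$$

and from (II.13), we know,

$$E(\hat{\mathbf{C}}^H\mathbf{d}\mathbf{d}^H\hat{\mathbf{C}}) = E\Big(\Big(\frac{1}{M}\sum_{i=1}^{M}\mathbf{x}_i\mathbf{x}_i^H\Big)\mathbf{d}\mathbf{d}^H\Big(\frac{1}{M}\sum_{j=1}^{M}\mathbf{x}_j\mathbf{x}_j^H\Big)\Big) = \frac{1}{M^2}\sum_{i=1}^{M}\sum_{j=1}^{M}E(\mathbf{x}_i\mathbf{x}_i^H\mathbf{d}\mathbf{d}^H\mathbf{x}_j\mathbf{x}_j^H). \qquad (VI.23)$$

Since this expression involves fourth moments of the process, its computation would be difficult in general. However, for a Gaussian random process all higher-order moments can be expressed in terms of first and second moments. In particular, if $v_1, v_2, v_3, v_4$ are complex Gaussian random variables, it is known that [40],

$$E(v_1 v_2^* v_3 v_4^*) = E(v_1 v_2^*)E(v_3 v_4^*) + E(v_1 v_4^*)E(v_3 v_2^*). \qquad (VI.24)$$

Applying (VI.24) to (VI.23) leads to the following expression, in which $[\,\cdot\,]_{a,b}$ denotes the N x N matrix whose (a, b)-th element, a, b = 0, ..., N-1, is shown,

$$\begin{aligned} E(\hat{\mathbf{C}}^H\mathbf{d}\mathbf{d}^H\hat{\mathbf{C}}) &= \frac{1}{M^2}\sum_{ijgk} d_k d_g^*\,\big[E(x_i(a)x_i^*(k)x_j(g)x_j^*(b))\big]_{a,b} \\ &= \frac{1}{M^2}\sum_{ijgk} d_k d_g^*\Big\{\big[E(x_i(a)x_i^*(k))E(x_j(g)x_j^*(b))\big]_{a,b} + \big[E(x_i(a)x_j^*(b))E(x_j(g)x_i^*(k))\big]_{a,b}\Big\} \\ &= \mathbf{C}^H\mathbf{d}\mathbf{d}^H\mathbf{C} + \frac{1}{M^2}\sum_{ijgk} d_k d_g^*\,E(x_j(g)x_i^*(k))\,\big[E(x_i(a)x_j^*(b))\big]_{a,b} \end{aligned} \qquad (VI.25)$$

where

$$\sum_{ijgk} \triangleq \sum_{i=1}^{M}\sum_{j=1}^{M}\sum_{g=0}^{N-1}\sum_{k=0}^{N-1}.$$

Substituting (VI.25) in (VI.22) results in,

$$E(\Delta\mathbf{C}^H\mathbf{d}\mathbf{d}^H\Delta\mathbf{C}) = \frac{1}{M^2}\sum_{ijgk} d_k d_g^*\,E(x_j(g)x_i^*(k))\,\big[E(x_i(a)x_j^*(b))\big]_{a,b}. \qquad (VI.26)$$

Further, if the noise is white and the complex amplitudes of the signal are uncorrelated, then if $i \neq j$, $E(x_j(g)x_i^*(k)) = 0$. In (VI.26), among the $M^2$ possible i, j combinations, there are only M terms for which i = j. Retaining only these terms we have (assuming stationarity, the dependence on i (= j) has been suppressed),

$$E(\Delta\mathbf{C}^H\mathbf{d}\mathbf{d}^H\Delta\mathbf{C}) = \frac{1}{M}\sum_{g=0}^{N-1}\sum_{k=0}^{N-1} d_k d_g^*\,E(x(g)x^*(k))\,\big[E(x(a)x^*(b))\big]_{a,b} \qquad (VI.27a)$$

$$= \frac{1}{M}\sum_{g=0}^{N-1}\sum_{k=0}^{N-1} d_k d_g^*\Big(\sum_{l=1}^{p}|a_l|^2 e^{j\omega_l(g-k)} + \sigma^2\delta_{gk}\Big)\mathbf{C}. \qquad (VI.27b)$$

In the above, $\delta_{gk}$ is the Kronecker delta and hence,

$$E(x(g)x^*(k)) = \begin{cases} C(0) = \sum_{l=1}^{p}|a_l|^2 + \sigma^2, & \text{if } g = k, \\ C(g-k) = \sum_{l=1}^{p}|a_l|^2 e^{j\omega_l(g-k)}, & \text{otherwise.} \end{cases} \qquad (VI.27c)$$

Substituting (VI.27b) in (VI.21),

$$E(\Delta\mathbf{d}'\Delta\mathbf{d}'^H) = \mathbf{G}^{\#}\mathbf{D}_S^H\Big(\frac{1}{M}\sum_{g=0}^{N-1}\sum_{k=0}^{N-1} d_k d_g^*\Big(\sum_{l=1}^{p}|a_l|^2 e^{j\omega_l(g-k)} + \sigma^2\delta_{gk}\Big)\mathbf{C}\Big)\mathbf{D}_S(\mathbf{G}^{\#})^H. \qquad (VI.28)$$

Using this in the expression of the MSE of signal zeros in (VI.9b),

$$E(|\Delta z_i|^2) = S\,\mathbf{V}^H(e^{j\omega_i})\,\mathbf{G}^{\#}\mathbf{D}_S^H\Big(\frac{1}{M}\sum_{g=0}^{N-1}\sum_{k=0}^{N-1} d_k d_g^*\Big(\sum_{l=1}^{p}|a_l|^2 e^{j\omega_l(g-k)} + \sigma^2\delta_{gk}\Big)\mathbf{C}\Big)\mathbf{D}_S(\mathbf{G}^{\#})^H\,\mathbf{V}(e^{j\omega_i}). \qquad (VI.29)$$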
Now we are ready to obtain the final expressions for the bias and MSE of the DOA estimates.

VI.6 : Bias in the AOA Estimates

Using the developments in (VI.19)-(VI.20) in the AOA bias equation (VI.4),

$$E(\Delta\theta_i) \approx 0. \qquad (VI.30)$$
VI.7 : Mean Squared Error in the AOA Estimates

Using (VI.29) in (VI.5), we finally obtain the MSE expression of AOAs,

$$E(|\Delta\theta_i|^2) = \left(\frac{\lambda}{2\pi d\cos\theta_i}\right)^2 S\,\mathbf{V}^H(e^{j\omega_i})\,\mathbf{G}^{\#}\mathbf{D}_S^H\Big(\frac{1}{M}\sum_{g=0}^{N-1}\sum_{k=0}^{N-1} d_k d_g^*\Big(\sum_{l=1}^{p}|a_l|^2 e^{j\omega_l(g-k)} + \sigma^2\delta_{gk}\Big)\mathbf{C}\Big)\mathbf{D}_S(\mathbf{G}^{\#})^H\,\mathbf{V}(e^{j\omega_i}). \qquad (VI.31)$$

This expression explicitly shows that the mean squared error in the AOAs depends not only on the parameter sensitivities but also on the specific structure of the D-MNM method. It should be mentioned here that some of the developments presented here are similar to the work in [25]. However, the results in [25] depend heavily on statistical properties of eigenvalues/eigenvectors, which are neither applicable nor appropriate in the present case. Hence, in the development above the moments of the data and covariance have been used directly. Simulation studies with data sets similar to those used in [25] verify a close match between the theory and simulation results.
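The key moment used in the MSE derivation, the expected value of $\Delta\mathbf{C}^H\mathbf{d}\mathbf{d}^H\Delta\mathbf{C}$, reduces under the Gaussian assumption to $(\mathbf{d}^H\mathbf{C}\mathbf{d}/M)\,\mathbf{C}$, which is the matrix form of (VI.27b). The sketch below checks this by Monte-Carlo simulation. It draws the snapshots directly as circular complex Gaussian vectors with covariance C, which is a simplifying assumption; the vector d, the frequencies and the noise level are arbitrary illustrative values.

```python
import numpy as np

def theoretical_moment(C, d, M):
    """(d^H C d / M) * C, the closed form for E(dC^H d d^H dC) with Gaussian snapshots."""
    return (np.vdot(d, C @ d).real / M) * C

def monte_carlo_moment(C, d, M, trials, rng):
    """Sample average of (dC d)(dC d)^H, where dC = C_hat - C and C_hat uses M snapshots."""
    N = C.shape[0]
    L = np.linalg.cholesky(C)                                  # C = L L^H
    W = (rng.standard_normal((trials, M, N))
         + 1j * rng.standard_normal((trials, M, N))) / np.sqrt(2.0)
    X = W @ L.T                                                # each row is a snapshot ~ CN(0, C)
    C_hat = np.einsum('tmn,tmk->tnk', X, X.conj()) / M         # sample covariance per trial
    v = (C_hat - C) @ d                                        # dC d (dC is Hermitian)
    return np.einsum('tn,tk->nk', v, v.conj()) / trials

N, M = 4, 10
omega = np.array([1.0, 1.6])                                   # illustrative frequencies
T = np.exp(1j * np.outer(np.arange(N), omega))
C = T @ T.conj().T + 0.5 * np.eye(N)                           # unit-power sources, sigma^2 = 0.5
d = np.array([1.0, 0.3 - 0.2j, -0.5j, 0.1])                    # arbitrary vector with d_0 = 1
rng = np.random.default_rng(0)
mc = monte_carlo_moment(C, d, M, trials=20000, rng=rng)
th = theoretical_moment(C, d, M)
print(np.linalg.norm(mc - th) / np.linalg.norm(th))            # relative mismatch
```

The 1/M factor makes explicit that the coefficient error covariance, and hence the AOA mean squared error, falls in proportion to the number of snapshots.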


VII :  Simulation  Results 

In the first two examples, the performance of the proposed D-MNM algorithm is compared with the performance of some of the existing well-known algorithms using Monte-Carlo studies. In Example 3, the theoretically derived MSE formulae are verified by comparing theory and simulation studies. The theoretical performance is also compared with some of the eigen-based counterparts.

VII.a : DOA Estimation

Simulation 1 : Two Closely-Spaced Targets of Equal Powers [15, 38, 39]

Planewaves from p = 2 sources with $\theta_1 = 18°$ and $\theta_2 = 22°$ incident on N = 8 sensors were modeled as in [15, 16, 18]. The number of snapshots is M = 10. Fig. 1 shows the norms of the $\mathbf{f}_i$ vectors for 20 trials at 20 dB SNR. The two largest norms always appear to be more significant than all the smaller ones. Figures 2a and 2b show the roots of D(z) for 50 independent realizations using D-MNM and E-MNM, respectively. The figures show that the roots in both cases are at almost the same locations. Table 1 compares E-MNM and D-MNM in terms of the bias and RMS values with 200 independent trials at different SNR values. The results clearly indicate that the performance of D-MNM is quite close to that of E-MNM, though no eigendecomposition was required in this case. In fact, D-MNM was found to be somewhat more robust (in terms of successful trials) at low SNR ranges.

VII.b : Frequency Estimation

In this Section, the proposed algorithm is compared with the well-known Tufts-Kumaresan (TK) method [17, 42] and the MUSIC method [1, 30] via simulations.

Simulation 2 : Comparison of High-Resolution Performance and Threshold Enhancement

The simulation data is generated using the formula [42] :

$$y(n) = a_1 e^{j2\pi(0.5)n + j\pi/4} + a_2 e^{j2\pi(0.52)n} + w(n), \qquad \text{for } n = 0, 1, \ldots, M-1 \qquad (VII.1)$$

where w(n) is complex white Gaussian noise with variance $\sigma^2$. The number of data samples used is M = 25. This data set has been widely used in the literature for studying the performance of various methods. For this data set, it has been shown in [42] that the TK method performs best when a high-order (L x L) forward-backward covariance matrix with L = 18 is used [42]. Five hundred independent noise realizations were used to compare the performance of the proposed method with that of the TK method and MUSIC. The mean values for the three cases at different SNR values are displayed in Fig. 4. The RMSE results are shown in Fig. 5 along with the CR bound for the frequency at 0.52 Hz. The bias and RMSE at different SNR values are also tabulated in Table 2. Clearly, the proposed method extends the performance threshold closer to the CR bound. Hence the performance of the proposed method approaches that of the Maximum-Likelihood method more closely, although with considerably less computational complexity.
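A data set of the kind used in Simulation 2 can be generated as in the sketch below. Several details are assumptions rather than facts from the report: unit amplitudes $a_1 = a_2 = 1$, a phase offset of π/4 on the first component, and SNR defined as $10\log_{10}(|a_1|^2/\sigma^2)$ with $\sigma^2$ the total complex noise variance. The function name `two_tone_data` is hypothetical.

```python
import numpy as np

def two_tone_data(M=25, snr_db=10.0, rng=None, a1=1.0, a2=1.0):
    """Return (y, s): noisy and noise-free samples of two closely spaced complex sinusoids."""
    rng = np.random.default_rng() if rng is None else rng
    n = np.arange(M)
    # Normalized frequencies 0.5 and 0.52; the pi/4 phase offset is an assumed value.
    s = (a1 * np.exp(1j * (2.0 * np.pi * 0.5 * n + np.pi / 4.0))
         + a2 * np.exp(2j * np.pi * 0.52 * n))
    sigma2 = abs(a1) ** 2 / 10.0 ** (snr_db / 10.0)           # assumed SNR definition
    w = np.sqrt(sigma2 / 2.0) * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
    return s + w, s

# One realization at 10 dB; a Monte-Carlo study repeats this with independent noise.
y, s = two_tone_data(M=25, snr_db=10.0, rng=np.random.default_rng(0))
print(y.shape, np.mean(np.abs(y - s) ** 2))                   # sample noise power
```

With M = 25 the Fourier resolution is 1/25 = 0.04, so the 0.02 spacing between the two tones is below the Periodogram's resolution limit, which is what makes this data set a useful test.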

VII.c : Perturbation Analysis of D-MNM

In this Section, the theoretical formulae derived for the MSE and bias of the DOA estimates using the proposed DFT-based algorithm are verified by comparing theoretical and simulation results.

Simulation 3 : Perturbation Analysis of AOAs with Small Number of Sensors

The problem scenario is identical to that described in Simulation 1 for AOA estimation. For this case, N = 8, M = 100, and the AOAs are 18° and 22°. In these simulations the theoretical formulas for mean-squared error, as derived in (66) and (67), are compared with the performance using simulated data. The results for various signal-to-noise ratios (SNR) are shown in Fig. 6, which shows the result corresponding to the AOA of 18° using all of the 200 independent trials.

Note that the data set considered in Simulation 3 was also used in [25] for perturbation analysis of the eigen-based MNM and MUSIC. But it appears that in [25], only the results with successful trials were plotted. Fig. 6 indicates that for this example, the D-MNM method appears to have smaller squared error than E-MNM, especially at low SNR. This was found to be the case for all trials. In fact, the success rate was higher for D-MNM when compared with E-MNM. Following the trend in the eigen-based cases, Root-MUSIC fared a little better (except at low SNR), but it may be noted that in the case of Root-MUSIC the polynomial to be rooted has twice the order of that in either D-MNM or E-MNM. Finally, it may be emphasized here that the theoretical predictions based on the formulas derived in this paper were found to be quite close to those obtained by computer simulations.

VIII  :  Analysis,  Discussion  and  Directions  on  Further  Research 

The results presented so far are quite intriguing and may have some important consequences for simplifying the present practice of frequency/DOA estimation. The proposed approach of forming the signal-subspace using the DFT without any eigendecomposition also opens up new avenues for further research and, at the same time, poses some unanswered questions. Furthermore, it may be possible to extend and incorporate similar ideas in other closely related problems or to develop more simplified algorithms. Clearly, the major advantage of the proposed approach is that all the signal-subspace vectors are obtained with a single matrix multiplication. This step may be performed using the FFT, which is very efficient for hardware and software implementation. In the following, some analysis as well as some possible directions for further research are briefly discussed.

1. Reduced Computational Complexity and Usefulness in High Sampling-Rate Problems : The major significance of D-MNM is that its high-resolution capability does not rely on any iterative method or on eigendecomposition, which is also computed iteratively. The lower computational complexity of D-MNM should be attractive in any general frequency/DOA estimation scenario. But the usefulness of the proposed method should be especially significant in those applications where traditional high-resolution methods are yet to make much inroads, due mainly to extremely high sampling rate requirements. Specifically, in Electronic Warfare (EW) applications, the signals usually operate in the GHz range but real-time, high-resolution capability is a necessity [41]. Currently no EW receiver processes signals entirely in the digital domain. The proposed DFT-based MNM, with its low computational complexity, is expected to provide the desired high-resolution capability to future digital EW receivers.

2. Signal-Subspace Information from the Autocorrelation Matrix Only : The strength of the Minimum-Norm framework as a high-resolution method really comes from its ability to form the 'noise-subspace' vector d by exploiting the orthogonality property in (V.11). It appears that as long as $\mathbf{F}_S$ has some component of the signal-subspace T, the solution of (V.11) would retain its high-resolution capability. The DFT-of-AC is an appropriate candidate to produce $\mathbf{F}_S$ because it is a linear combination of the signal-vectors in T. This can be seen by rewriting the DFT-of-AC matrix,

$$\mathbf{F} = \mathbf{C}\mathbf{D} = \frac{1}{M}\sum_{m=1}^{M}\mathbf{x}_m(\mathbf{x}_m^H\mathbf{D}). \qquad (VIII.1)$$

In fact, the AC matrix itself is also a possible candidate for obtaining the 'signal-subspace' $\mathbf{F}_S$, because it can be expressed as a linear combination of the signal-vectors in T,

$$\mathbf{C} = \frac{1}{M}\sum_{m=1}^{M}\mathbf{x}_m\mathbf{x}_m^H. \qquad (VIII.2)$$

Theoretically, the norm of each vector in the ideal C should be equal, but with a noisy C the norms of some of the vectors may be reduced while for other vectors the norms may be more than the nominal value. Hence, the ideal choice would be to pick the vectors with norms in the middle range. Not surprisingly, when $\mathbf{F}_S$ is formed in this manner with p vectors of the estimated C, MNM again demonstrated high-resolution capability in simulations (not included). This simpler procedure to obtain 'signal-subspace' information needs to be studied further. But it must be stated that D-MNM performs better at low SNR because the DFT operation accentuates the signal-subspace, as discussed next.

3. Asymptotic Analysis of the DFT-based Signal Subspace for Arbitrary DOA/Frequency : For ideal noise-free observations, if the frequencies are not on the DFT bins, the DFT-of-AC operation can be expressed as :

$$\mathbf{F} = \mathbf{C}\mathbf{D} \qquad (VIII.3a)$$

$$= \mathbf{T}\mathbf{S}\mathbf{T}^H\mathbf{D} \qquad (VIII.3b)$$

$$= \mathbf{T}\mathbf{S}\begin{bmatrix} \mathbf{t}_1^H\mathbf{D} \\ \mathbf{t}_2^H\mathbf{D} \\ \vdots \\ \mathbf{t}_p^H\mathbf{D} \end{bmatrix}. \qquad (VIII.3c)$$

Consider the matrix at right. Each of the $\mathbf{t}_i^H\mathbf{D}$ vectors is the complex-valued DFT of a complex sinusoid sequence. The magnitude of each row vector $\mathbf{t}_i^H\mathbf{D}$ has a sinc envelope with a peak occurring at the column corresponding to the bin location closest to the frequency $\omega_i$. For infinite aperture with $N \to \infty$, i.e., for a large number of sensors, each row vector peaks at $\omega_i$ and the other elements of that row approach zero. The same will be the case for each of the other row vectors also. Hence, asymptotically, the DFT-of-AC operation again produces the p largest-norm vectors at the true signal frequencies. The asymptotic analysis for the noisy case as defined by (V.7) would also provide similar results. For finite N, because of the sinc weighting, the largest-norm vectors will also have contributions from some other $\mathbf{t}_j$ vectors in T. But those components also contain signal-subspace information which is orthogonal to d and hence useful for obtaining the minimum-norm vector d.

4. Estimation of the Parameters of Damped Sinusoids in Noise : Many eigen-based methods have been successfully utilized in estimating the unknown parameters of damped sinusoids from noisy observations [5, 14]. It appears that with some simple modifications the proposed DFT-based approach could also be used for the same purpose. The advantage would again be that no eigendecomposition is needed, while the performance is expected to be comparable.

5. Largest Norms vs. Peaks : In all the simulations presented here, the signal subspaces have been formed by selecting the p unit-vectors having the largest norms. But the ideal solution may be to pick the unit vectors corresponding to the p largest peaks (having smaller-norm vectors on both adjacent bins). This may eliminate any possibility of picking multiple vectors from the vicinity of strong signals. It should be emphasized though that the largest-norm criterion has worked quite well so far, as demonstrated by a large number of simulations. But this aspect certainly needs further analysis.

6. Zero padding : In classical spectral estimation, the Periodogram relies on the DFT/FFT, but it is often necessary to extend (or pad) the available data with zeros so that interpolated values between available bins can be calculated. Zero-padding is also used to extend data lengths to powers of two so that the computational efficiency of the FFT can be exploited. In the simulation studies, no zero-padding has been incorporated so far. It is not quite apparent whether the zero-padding should be applied directly to the data or to the covariance estimates, and this aspect needs further study. It would also be necessary to study the possible effects on the signal-subspace produced by the DFT-of-AC operation after zero-padding is introduced.

7. Windowing : In classical spectral estimation, in order to avoid sudden discontinuities, the observed data is often weighted (or tapered at both ends) by a non-rectangular window, which tends to enhance the 'dynamic range' at the cost of 'resolution'. In the simulation results presented here, no windowing has been used. But windowing is known to be highly effective in locating weak frequency components which tend to get submerged by the sidelobes of strong components. Though it is believed that the orthogonality property in (V.11) is the main contributing factor for the high-resolution capability of D-MNM, it would certainly be interesting to study what effects windowing might have on the performance of D-MNM.

8. Use of DFT-Based Signal-Subspace in other Eigen-Based Methods : Other than the Minimum-Norm Method covered in this paper, there is a large body of work where some form of eigendecomposition is utilized to estimate DOA/Frequencies [1, 2, 8, 10, 13-15, 19, 22, 23, 30, 32, 36, 42, 43, 44-47, 49]. Among the more important results are MUSIC [30], SVD [15, 42] and ESPRIT [22]. It is quite possible that the proposed DFT-based signal-subspace may be incorporated into some of these existing eigendecomposition-based methods, in order to implement those methods without eigendecomposition. Clearly, the proposed approach can be used to implement MUSIC, except that the noise subspace $\mathbf{F}_N$ defined in (V.8c) would have to be utilized. Also, the left and right singular vectors of the SVD of a data matrix are actually the eigenvectors of correlation matrices. Hence, it appears that some of the SVD-based approaches may also be modified to incorporate DFT-based signal/noise subspaces. Care should be taken about the choice of either the left or right signal-spaces, because both may not contain signal information. The case is not so apparent for those methods which use generalized eigendecomposition [22, 36, 45]. As part of this paper, some of these possibilities will be further investigated.
9. DFT-Prony : There has been some recent interest in implementing Prony's algorithm in the Frequency-Domain [29]. Clearly, the signal-vectors in $\mathbf{F}_S$ can be treated as multiple time-series to form a (p+1) x (p+1) covariance matrix (using the forward-backward approach) and then the p-th order Prony's polynomial can be estimated. Based on preliminary simulations (not included), this approach appears to be the simplest of all existing methods with moderately good high-resolution performance. The performance of DFT-Prony is much better than that of the standard Prony's method because the DFT-based signal subspace is cleaned up, though without any eigendecomposition. These ideas will be further studied as part of this paper.

10. Two-Dimensional Frequency-Wavenumber Estimation : In some array processing scenarios, both the DOAs (related to wavenumbers) and the center frequencies need to be estimated. Many existing 1-D eigen-based methods have been extended to 2-D to address this problem. It appears that the DFT-of-AC vectors can be formed in both domains and two D(z) polynomials can be formed to estimate the frequencies and DOAs separately. Incorporation of the DFT-based signal-spaces for 2-D frequency estimation will be further investigated as part of this paper.

11.  Hardware  Implementation  :  Perhaps  the  most  important  and  useful  practical  impact  of  the  proposed 
method  would  be  in  the  area  of  hardware  implementation  for  high-resolution  Direction-of- Arrival  or  frequency 
estimation.  All  the  currently  available  methods  with  good-enough  high-resolution  capability,  rely  on  some 
form  of  iterative  optimization  or  iterative  computation  of  eigenvectors.  In  contrast,  all  that  the  proposed 
approach  requires  to  form  the  ‘signal-subspace’  is  a  single  matrix  multiplication.  Furthermore,  the  matrix  to 
be  multiplied  is  a  DFT  matrix  and  it  has  special  structures  so  that  FFT  based  processing  may  be  utilized  to 
further  reduce  the  computational  burden.  Hence,  one  of  the  major  goals  of  the  proposed  work  would  be  to 
devise  appropriate  strategies  to  design,  develop  and,  if  possible,  fabricate  VLSI  hardware  for  high-resolution 
DOA/Frequency  estimation. 
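Two of the points raised above lend themselves to a short sketch: forming the DFT-of-AC matrix with an FFT instead of an explicit matrix product (items 1 and 11), and the difference between the largest-norm and largest-peak selection rules (item 5). The code assumes the unitary DFT convention $D[n,k] = e^{+j2\pi nk/N}/\sqrt{N}$, and the norm values in the example are made up to show the two rules disagreeing.

```python
import numpy as np

def dft_of_ac_fft(C):
    """C @ D computed row-wise with an FFT, for D[n, k] = exp(+j*2*pi*n*k/N)/sqrt(N)."""
    return np.fft.ifft(C, axis=1, norm="ortho")

def largest_norm_bins(norms, p):
    """Indices of the p largest norms (the rule used in the simulations)."""
    return np.sort(np.argsort(norms)[-p:])

def largest_peak_bins(norms, p):
    """Indices of the p largest local maxima, treating the bins as circular."""
    left, right = np.roll(norms, 1), np.roll(norms, -1)
    peaks = np.flatnonzero((norms > left) & (norms >= right))
    return np.sort(peaks[np.argsort(norms[peaks])[-p:]])

# FFT route versus explicit multiplication on a random Hermitian matrix.
rng = np.random.default_rng(2)
A = rng.standard_normal((8, 8)) + 1j * rng.standard_normal((8, 8))
C = A @ A.conj().T
n = np.arange(8)
D = np.exp(2j * np.pi * np.outer(n, n) / 8) / np.sqrt(8)
print(np.max(np.abs(dft_of_ac_fft(C) - C @ D)))

# Made-up column norms: a strong source leaking into bin 2 and a weaker source at bin 4.
norms = np.array([1.0, 5.0, 4.0, 1.0, 3.0, 1.0, 0.5, 0.2])
print(largest_norm_bins(norms, 2), largest_peak_bins(norms, 2))
```

In the made-up example the largest-norm rule picks two adjacent bins of the strong source, whereas the peak rule picks one bin per source, which is the situation item 5 describes.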


Fig. 1. Norms of the DFT-of-AC vectors

Fig. 2. Roots of D(z) using (a) E-MNM and (b) D-MNM for 50 independent realizations
SNR       Successful Trials    Bias (in degrees)                       RMS
(in dB)   D-MNM    E-MNM       D-MNM              E-MNM                D-MNM             E-MNM

5         59       39          -0.8480, 1.1589    -0.5539, 0.4329      1.4311, 1.9174    1.3623, 1.9322
10        139      130         -0.3154, 0.8940    -0.4589, 0.7603      1.3529, 1.7910    1.5063, 1.8571
15        191      189         -0.0714, 0.4623    -0.1094, 0.3648      0.9812, 1.3118    1.0021, 1.3212
20        199      198         -0.0055, 0.1717    -0.0252, 0.1170      0.6777, 0.8440    0.6822, 0.8017
25        200      200         4.99e-4, 0.0611    -0.0067, 0.0481      0.4129, 0.4820    0.4302, 0.4826
30        200      200         0.0037, 0.0263     0.0018, -0.0219      0.2297, 0.2728    0.2329, 0.2737

Table 1 : Comparison of performance of D-MNM and E-MNM (two values per cell, in the order printed).
SNR           Bias                                                              RMS
(in dB)       D-MNM                 E-MNM                MUSIC                  D-MNM               E-MNM               MUSIC

0             -0.0349, 0.0352       -0.1205, 0.1118      -0.1178, 0.0983        0.0876, 0.0786      0.1783, 0.1748      0.1594, 0.1486
3             -0.0133, 0.0141       -0.1029, 0.1027      -0.0681, 0.0678        0.0415, 0.0468      0.1654, 0.1640      0.1312, 0.1265
5             [illegible]           -0.0964, 0.0838      -0.0343, 0.0392        0.0232, 0.0342      0.1476, 0.1378      [illegible]
[illegible]   [illegible]           -0.0754, 0.0658      -0.0063, 0.0111        [illegible]         [illegible]         [illegible]
15            [illegible]           [illegible]          [illegible]            [illegible]         [illegible]         [illegible]
20            -3.40e-6, -7.76e-5    -1.04e-5, 6.34e-5    1.61e-5, -5.23e-5      [illegible]         [illegible]         [illegible]
30            8.64e-6, -1.77e-5     2.18e-5, -9.01e-6    3.35e-6, -1.51e-5      3.53e-4, 3.75e-4    7.87e-4, 7.84e-4    4.61e-4, 4.50e-4

Table 2. Comparison of Bias and RMS values for three methods with 500 independent trials.
Section - 2.2 : MAXIMUM-LIKELIHOOD ESTIMATION OF MULTIPLE FREQUENCIES WITH CONSTRAINTS TO GUARANTEE UNIT CIRCLE ROOTS

Summary

A recently proposed approximate Maximum-Likelihood Estimator (MLE) of multiple exponentials converts the frequency estimation problem into a problem of estimating the coefficients of a z-polynomial with roots at the desired frequencies [1, 2]. Theoretically, the roots of the estimated polynomial should fall on the unit circle. But the MLE, as originally proposed, does not guarantee unit circle roots. This drawback sometimes causes merged frequency estimates, especially at low SNR [1, 3]. If all the sufficient conditions for the z-polynomial to have unit circle roots are incorporated, the optimization problem becomes too nonlinear and loses the desirable weighted-quadratic structure of the MLE. In this paper, the exact constraints are imposed on each of the 1st-order factors corresponding to individual frequencies for ensuring unit circle roots. The constraints are applied during optimization alternately for each frequency. In the absence of any merged frequency estimates, the RMS values more closely approach the theoretical Cramer-Rao (CR) bound at low SNR levels.


I.  Introduction 

Estimating  the  underlying  parameters  of  multiple  complex  exponential  signals  in  noise  remains  a  vigorously 
researched  topic  in  signal  processing  [1-13].  For  a  single  sinusoid  or  when  the  multiple  frequencies  are  well- 
separated,  the  Periodogram  performs  reasonably  well.  But  if  the  frequencies  are  closely  spaced,  which  often  occurs 
when  the  data  length  is  limited  or  the  aperture  is  too  small,  the  Periodogram  fails  to  distinguish  the  frequencies 
and  produces  merged  frequency  estimates.  In  order  to  overcome  the  Periodogram’s  resolution  limitation,  many 
high-resolution  methods  have  been  developed  in  the  past  two  decades  [1-13].  In  contrast  to  the  Periodogram, 
these  methods  make  effective  use  of  some  underlying  property  of  the  true  sinusoidal  signal  model. 

Among all the existing high-resolution frequency estimation methods, the MLE appears to provide the most
accurate frequency estimates and has the lowest SNR threshold [1-4]. Other high-resolution methods rely on
signal or noise subspace information, which is extracted from the eigendecomposition of the covariance matrix
or the SVD of the data matrix [5, 7-11]. On the other hand, the MLE considers the exact model of the exponential
signal and attempts to maximize the exact likelihood function to estimate the unknowns. For a single sinusoid,
the peak of the periodogram itself corresponds to the ML estimate, but for multiple exponentials the MLE turns
out to be a nonlinear optimization problem [1-6, 12, 13].

The MLE approaches developed independently in [1] and [2] estimate the frequencies from the roots of a
z-polynomial. It may be noted here that in the literature, these methods are sometimes referred to as KiSS [1, 5,
6] or IQML [2]. In the polynomial domain, the ML optimization problem turns out to be quasi-linear, where a
weighted-quadratic criterion is minimized iteratively. Though effective to a large extent, MLE is known to possess
one fundamental drawback: the optimization procedure in [1, 2] does not impose sufficient theoretical constraints
on the polynomial coefficients for the estimated roots to fall on the unit circle. The primary goal of this work is
to address this unresolved problem in MLE.

Two conditions must be satisfied for a general p-th order z-polynomial to have p unit circle roots: conjugate
symmetry (C1) and a derivative constraint (C2), the details of which are given later. In MLE, only C1 is
imposed. The derivative constraint makes the problem highly nonlinear and hence C2 cannot be incorporated
in the weighted-quadratic framework of MLE. But when p > 1, C1 alone is not sufficient for unit circle roots.
Furthermore, from the theory of linear-phase FIR filters, it is well known that the roots of a symmetric
z-polynomial may fall either on the unit circle or in reciprocal pairs inside and outside the unit circle.
In fact, it was demonstrated in [1] and [3] that, if SNR < 10 dB and the frequencies are closely spaced,
the roots extracted by MLE sometimes appear in reciprocal pairs. In such cases, two frequencies merge
to produce only a single frequency estimate. The alternate approach proposed in this paper attempts to alleviate
this limitation.

There is one exception to the two conditions stated above: for p = 1, the conjugate symmetry constraint
(C1) alone is sufficient for the single root to fall on the unit circle. This is the main idea which will be utilized
in developing the proposed Constrained-MLE (C-MLE) algorithm. Specifically, C1 will be imposed on each of
the 1st-order factors of the p-th order z-polynomial, such that each individual root falls on the unit circle. This
process need not be applied to all the frequencies at all SNRs. The constraints are imposed only on those 1st-order
factors which produce merged frequency estimates at convergence of MLE. The factors for which the roots are
already on the unit circle are held fixed. The proposed algorithm may be considered to be a polynomial-domain
counterpart of the 'Alternating Projection' approach [13], where the ML criterion is minimized w.r.t. one frequency
at a time while the other frequencies are held at the previously estimated values. Our work appears to be the
first attempt to guarantee unit circle roots through constraints on the polynomial coefficients in Maximum-Likelihood
frequency estimation. The constraints are primarily effective at low SNR levels, when there is a higher possibility
for MLE to produce merged frequency estimates. In simulations, the RMS values of the frequency estimates using C-MLE
were found to be closer to the theoretical CR bounds than those of the original MLE algorithm.

The paper is arranged as follows: In Section II, the ML problem is stated, the original MLE algorithm is
briefly discussed, and the conditions needed for unit circle roots are stated. In Section III, the proposed constrained
version of MLE is introduced. Simulation results are given in Section IV to verify the performance of C-MLE.


II.  The  Maximum  Likelihood  Problem  and  a  Brief  Overview  of  MLE 

The observed samples of a complex multiple exponential signal can be represented as

$$x(n) \triangleq \sum_{k=1}^{p} c_k e^{j(\omega_k n + \phi_k)} + z(n), \qquad n = 0, 1, \ldots, N-1, \qquad (1)$$

where $\omega_k$, $c_k$ and $\phi_k$ are the unknown angular frequency, amplitude and phase, respectively, of the k-th sinusoid;
p is the assumed number of sinusoids and z(n) represents i.i.d. $N(0, \sigma^2)$ Gaussian noise samples. For this signal
model, the MLE corresponds to optimization of the following error criterion [1-4]:

$$\min_{\{\omega_k,\, a_k\}} \sum_{n=0}^{N-1} \Big| x(n) - \sum_{k=1}^{p} a_k e^{j\omega_k n} \Big|^2, \qquad (2)$$

where $a_k \triangleq c_k e^{j\phi_k}$, for k = 1, 2, ..., p, are the complex amplitudes. The MLE problem stated in (2) is
a nonlinear optimization problem with respect to the angular frequencies. Instead, MLE forms an alternative
but equivalent error criterion in the polynomial coefficient domain, whose quasi-linear structure is
well suited for iterative optimization. A brief summary of the MLE criterion is in order.

Let $B(z) \triangleq b_0 + b_1 z^{-1} + \cdots + b_p z^{-p}$ be a p-th degree z-polynomial with p roots at $e^{j\omega_1}, \ldots, e^{j\omega_p}$,
respectively, and let $\mathbf{b} \triangleq [b_0\ b_1\ \cdots\ b_p]^T$ be the coefficient vector. The MLE criterion for estimating b is
[1]-[4]:

$$\min_{\{b_i\}_{i=0}^{p}} E(\mathbf{b}) = \mathbf{b}^H \mathbf{X}^H (\mathbf{B}\mathbf{B}^H)^{-1} \mathbf{X} \mathbf{b}, \qquad (4)$$

where $\mathbf{B}$ is a matrix formed from the coefficients in $\mathbf{b}$ and $\mathbf{X}$ is formed from the observed data.

The criterion in (4) appears to be quadratic in b, except that the weight matrix itself depends on the unknown
coefficients. Hence, this criterion is minimized iteratively. At each iteration,

$$\min_{\mathbf{b}} \mathbf{b}^H \mathbf{X}^H (\mathbf{B}\mathbf{B}^H)^{-1} \mathbf{X} \mathbf{b} \qquad (6)$$

is optimized, where the weight matrix $(\mathbf{B}\mathbf{B}^H)$ is held fixed and is formed using the estimate of b found at the
previous iteration. At convergence of these iterations, the frequencies are found from the roots of the estimated
polynomial B(z). Unfortunately, direct optimization of the criterion in (4) does not guarantee that the roots of
B(z) will indeed fall on the unit circle, and it was recognized in [1, 3] that two conditions must be satisfied to
guarantee unit circle roots:


C1 : The coefficients possess conjugate symmetry:

$$b_k = b_{p-k}^{*}, \qquad \text{for } k = 0, 1, \ldots, p, \qquad (7)$$

and,

C2 : For p > 1, the derivative of B(z), i.e.,

$$B'(z), \qquad (8)$$

must have zeros either inside or on the unit circle.

The polynomial domain MLE, as originally proposed, imposes the conjugate symmetry constraint only [1, 2].
C2 makes the optimization problem highly nonlinear, and the weighted-quadratic structure of (4) is lost if C2 is
incorporated in the algorithm. Hence, no attempt was made in [1-4] to include C2 in the algorithm. But if p > 1,
C1 is not a sufficient condition for unit circle roots. The same condition may, in fact, lead to roots in reciprocal
pairs, which can and does occur in MLE, especially at low SNR. In such cases, two closely spaced frequencies are
estimated as a single frequency only [1, 3].
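The insufficiency of C1 for p > 1 can be seen numerically. The sketch below is only an illustration: the angle, the radius 0.8, and the helper names are assumptions chosen for this example, not values from the report. It builds two conjugate-symmetric 2nd-order polynomials. One has a reciprocal root pair at a single angle (a "merged" estimate), and the other has two distinct unit circle roots at normalized frequencies 0.50 and 0.52.

```python
import numpy as np

def is_conjugate_symmetric(b):
    """C1 check: b[k] == conj(b[p - k]) for every k."""
    b = np.asarray(b, dtype=complex)
    return bool(np.allclose(b, np.conj(b[::-1])))

def polynomial_roots(b):
    """Roots of B(z) = b[0] + b[1] z^-1 + ... + b[p] z^-p.

    Multiplying by z^p gives b[0] z^p + ... + b[p], which is the
    highest-power-first ordering that numpy.roots expects.
    """
    return np.roots(np.asarray(b, dtype=complex))

theta = 2 * np.pi * 0.51   # common angle of the root pair (illustrative)
r = 0.8                    # radius of the inner root (illustrative)
delta = 2 * np.pi * 0.01   # half of the angular separation (illustrative)

# Reciprocal pair r*e^{j theta}, (1/r)*e^{j theta}: middle coefficient exceeds 2.
b_merged = np.array([np.exp(-1j * theta), -(r + 1 / r), np.exp(1j * theta)])
# Unit circle pair e^{j(theta +/- delta)}: middle coefficient is at most 2.
b_resolved = np.array([np.exp(-1j * theta), -2 * np.cos(delta), np.exp(1j * theta)])

for name, b in [("merged", b_merged), ("resolved", b_resolved)]:
    z = polynomial_roots(b)
    print(name, is_conjugate_symmetric(b), np.abs(z), np.angle(z) / (2 * np.pi))
```

Both coefficient vectors satisfy C1, yet only the second one has its roots on the unit circle. In the first, both roots share one angle, so only a single frequency would be reported.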

Important Observation : For p = 1, conjugate symmetry alone is a sufficient condition to ensure a unit
circle root. Hence, we propose to impose C1 sequentially on each 1st-order factor of B(z) during optimization of
(4). In that case, the optimization at each step will be with respect to only a 1st-order factor of B(z) and hence
there is no need to satisfy C2.
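The reasoning behind this observation can be written out in one line. In the short derivation below, the polar form $b_0 = |b_0| e^{j\psi}$ and the symbol $\psi$ are notation introduced here for illustration only.

```latex
% First-order conjugate-symmetric factor and its single root
B^{(i)}(z) = b_0 + b_0^{*} z^{-1} = 0
\;\Longrightarrow\;
z_i = -\frac{b_0^{*}}{b_0},
\qquad
|z_i| = \frac{|b_0^{*}|}{|b_0|} = 1 \quad (b_0 \neq 0).

% With b_0 = |b_0| e^{j\psi}, the root and the implied angular frequency are
z_i = -e^{-j 2\psi} = e^{j(\pi - 2\psi)},
\qquad
\omega_i = \pi - 2\psi \pmod{2\pi}.
```

The magnitude of $b_0$ cancels, so only its phase matters, and any nonzero choice of $b_0$ gives a root exactly on the unit circle.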


III.  Constrained  MLE  (C-MLE) 

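The p-th order polynomial B(z) can be expressed in factored form as:

$$B(z) = B^{(p-i)}(z)\, B^{(i)}(z), \qquad (9)$$

where $B^{(p-i)}(z) \triangleq b_0^{(p-i)} + b_1^{(p-i)} z^{-1} + \cdots + b_{p-1}^{(p-i)} z^{-p+1}$ and $B^{(i)}(z) \triangleq b_0^{(i)} + b_1^{(i)} z^{-1}$ are (p-1)-th
order and 1st-order factors, respectively. If conjugate symmetry is imposed on the 1st-order factor, then
$B^{(i)}(z) = b_0^{(i)} + b_0^{(i)*} z^{-1}$. Note that in (9) the coefficients of the polynomial B(z) are formed as the convolution
of the coefficients of $B^{(p-i)}(z)$ and $B^{(i)}(z)$. Hence, in matrix-vector notation:

$$\mathbf{b} =
\begin{bmatrix}
b_0^{(p-i)} & 0 \\
b_1^{(p-i)} & b_0^{(p-i)} \\
\vdots & \vdots \\
b_{p-1}^{(p-i)} & b_{p-2}^{(p-i)} \\
0 & b_{p-1}^{(p-i)}
\end{bmatrix}
\begin{bmatrix} b_0^{(i)} \\ b_0^{(i)*} \end{bmatrix}
\triangleq \mathbf{B}_{p-i}\, \mathbf{C}\, \mathbf{b}_i, \qquad (10)$$

where $\mathbf{B}_{p-i}$ denotes the matrix-factor with the i-th 1st-order factor removed and $b_0^{(i)} \triangleq b_{0R}^{(i)} + j\, b_{0I}^{(i)}$, so that
$\mathbf{b}_i$ collects the real and imaginary parts of $b_0^{(i)}$ and $\mathbf{C}$ maps them to $[b_0^{(i)}\ \ b_0^{(i)*}]^T$. Using
(10) in (6), each 1st-order factor of B(z) is estimated by optimizing

$$\min_{\mathbf{b}_i} \mathbf{b}_i^T \Big[ \mathbf{C}^H \mathbf{B}_{p-i}^{(k-1)H} \mathbf{X}^H \big( \mathbf{B}^{(k-1)} \mathbf{B}^{(k-1)H} \big)^{-1} \mathbf{X}\, \mathbf{B}_{p-i}^{(k-1)} \mathbf{C} \Big] \mathbf{b}_i, \qquad \text{for } i = 1, 2, \ldots, p. \qquad (11)$$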

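This is a weighted-quadratic criterion of the form:

$$\min_{\mathbf{b}_i} \mathbf{b}_i^T \mathbf{W}_i^{(k-1)} \mathbf{b}_i, \qquad (12a)$$

where

$$\mathbf{W}_i^{(k-1)} \triangleq \mathbf{C}^H \mathbf{B}_{p-i}^{(k-1)H} \mathbf{X}^H \big( \mathbf{B}^{(k-1)} \mathbf{B}^{(k-1)H} \big)^{-1} \mathbf{X}\, \mathbf{B}_{p-i}^{(k-1)} \mathbf{C}. \qquad (12b)$$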

Note that the weight matrix is formed with the estimates found at the (k-1)-th iteration step, when the
unconstrained MLE algorithm is assumed to have converged. The criterion in (11) can be optimized sequentially
or concurrently for each first-order factor. At each iteration, $\mathbf{b}_i$ is estimated as the eigenvector corresponding
to the minimum eigenvalue of the weight matrix in (12a). The advantage of using (12a) instead of (6) is that, since each
$B^{(i)}(z)$ is a first-order z-polynomial, the conjugate symmetry constraint is sufficient to guarantee that the root of
$B^{(i)}(z)$ falls on the unit circle. In practice, the alternate optimization procedure in (11) need not be carried out
for all the p factors of B(z). It needs to be invoked only in those cases for which unconstrained MLE produces
merged frequency estimates. The roots which are already on the unit circle need not be optimized further. This
sequential process guarantees that all the roots of B(z) will indeed fall on the unit circle.
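One update of a single 1st-order factor might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names are hypothetical. The parametrization of the factor by the real and imaginary parts of its leading coefficient, the matrix C = [[1, j], [1, -j]], and the use of the real part of the weight matrix for the eigen-decomposition are all assumptions made here, so that the unknown vector is real.

```python
import numpy as np

def data_matrix(x, p):
    """Rows [x(n), x(n-1), ..., x(n-p)] for n = p, ..., N-1."""
    return np.array([x[n - p:n + 1][::-1] for n in range(p, len(x))])

def coeff_matrix(b, N):
    """(N-p) x N banded matrix with rows [.. b_p ... b_0 ..], so that B @ x == X @ b."""
    p = len(b) - 1
    B = np.zeros((N - p, N), dtype=complex)
    for r in range(N - p):
        B[r, r:r + p + 1] = b[::-1]
    return B

def cmle_factor_update(x, roots, i):
    """Re-estimate root i on the unit circle while the other roots are held fixed."""
    x = np.asarray(x, dtype=complex)
    roots = np.asarray(roots, dtype=complex)
    p, N = len(roots), len(x)
    X = data_matrix(x, p)
    B = coeff_matrix(np.poly(roots), N)          # weights from the previous estimate
    c = np.poly(np.delete(roots, i))             # (p-1)-th order factor, p coefficients
    Bpi = np.zeros((p + 1, 2), dtype=complex)    # convolution matrix of that factor
    Bpi[:p, 0] = c
    Bpi[1:, 1] = c
    C = np.array([[1, 1j], [1, -1j]])            # maps [Re b0, Im b0] to [b0, conj(b0)]
    A = X @ Bpi @ C
    M = A.conj().T @ np.linalg.solve(B @ B.conj().T, A)
    # For a real unknown vector, b^T M b equals b^T Re(M) b (assumption noted above).
    _, V = np.linalg.eigh(M.real)
    b0 = V[0, 0] + 1j * V[1, 0]                  # eigenvector of the smallest eigenvalue
    return -np.conj(b0) / b0                     # root of b0 + conj(b0) z^-1
```

By construction the returned root has unit magnitude. As a usage example, with noise-free data containing tones at 0.52 and 0.50, starting from `roots = [0.9 * np.exp(2j * np.pi * 0.53), np.exp(2j * np.pi * 0.50)]` and calling `cmle_factor_update(x, roots, 0)` should move the first root back to the unit circle at frequency 0.52.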


IV.  Simulation  Results 

The algorithm described in this paper has been tested with the same simulated data set used in [1] and [2]. The
following formula was used to generate the data:

$$x(n) = a_1 e^{j\omega_1 n} + a_2 e^{j\omega_2 n} + z(n), \qquad n = 0, 1, \ldots, 24, \qquad (13)$$

where $\omega_1 = 2\pi f_1$, $\omega_2 = 2\pi f_2$, $f_1$ and $f_2$ being 0.52 and 0.50, respectively, $a_1 = 1$, $a_2 =$ [illegible], and z(n) is a
computer generated white, zero-mean, complex Gaussian noise sequence with variance $\sigma^2$, i.e., $\sigma^2/2$ is the
variance of the real and the imaginary parts of z(n). SNR is defined as $10 \log_{10}(1/\sigma^2)$. Two hundred data sets
with independent noise epochs were used.
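A data set of this kind could be generated along the following lines. The sketch rests on several assumptions: both amplitudes default to 1 because the value of the second amplitude is not legible in the source, the SNR is taken as the ratio of unit tone power to total complex noise variance, and the function name and arguments are hypothetical.

```python
import numpy as np

def two_tone_data(snr_db, n_samples=25, f=(0.52, 0.50), a=(1.0, 1.0), rng=None):
    """Two complex exponentials in white, zero-mean, complex Gaussian noise.

    a: complex amplitudes (the second value is an assumption, see above).
    snr_db: assumed to mean 10*log10(1 / sigma^2), sigma^2 = total noise variance.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = np.arange(n_samples)
    signal = sum(ak * np.exp(2j * np.pi * fk * n) for ak, fk in zip(a, f))
    sigma2 = 10.0 ** (-snr_db / 10.0)
    # Real and imaginary parts each carry half of the total noise variance.
    noise = np.sqrt(sigma2 / 2.0) * (
        rng.standard_normal(n_samples) + 1j * rng.standard_normal(n_samples)
    )
    return signal + noise

# 200 independent trials at 5 dB, as in the experiment described above.
rng = np.random.default_rng(0)
trials = [two_tone_data(5.0, rng=rng) for _ in range(200)]
```

Passing one generator through all trials keeps the noise epochs independent while leaving the whole experiment reproducible from a single seed.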

Fig. 1a and 1b show the estimated roots for 200 independent trials of MLE for SNR = 5 dB and 10 dB,
respectively. Fig. 1d and 1e show the corresponding results with C-MLE. For the 10 dB case, Figures 1c and 1f
show only the merged cases before and after applying the exact constraints. The unit circle roots in Fig. 1f show
a wider spread than the corresponding merged frequency estimates in Fig. 1c. Fig. 2 compares the performance of
MLE and C-MLE with the theoretical CR bound. The results verify that C-MLE performs better than the original
MLE in the low SNR range. The performance of C-MLE has also been compared with that of the AP method [13], and
the results are displayed in Fig. 3. Clearly, the proposed method outperforms the AP method for this example,
especially at low SNR.

Fig. 1. Superimposed plots of estimated roots for 200 independent trials using MLE (a-c) and
C-MLE (d-f).

[Figure: Comparison of Performance.] Performance comparison of C-MLE and AP methods with the theoretical CR-bound.
200 independent trials were used.


Section  -  2.3  :  IMPROVED  AR-PARAMETER  ESTIMATION  FROM  NOISY  OBSERVATION  DATA 

Summary 

Auto-Regressive  (AR)  modeling  is  the  most  widely  used  approach  for  model-based  spectrum  estimation. 
But  almost  all  the  existing  methods  for  AR-parameter  estimation  show  severe  degradation  if  the  observed  signal 
is  corrupted  with  noise.  In  fact,  all  the  commonly  used  techniques,  such  as,  Autocorrelation  Method  (AM), 
Covariance  Method  (CM),  Modified  Covariance  Method  and  their  variations,  give  poor  Power  Spectral  Density 
(PSD) estimates when the observations are noisy. In this Section, a data-adaptive pre-filtering approach is
presented  to  address  this  problem.  Preliminary  results  indicate  that  when  only  noisy  data  is  available  for  modeling, 
the  proposed  technique  gives  more  accurate  PSD  estimates  than  the  commonly  used  methods. 


Introduction 

Auto-regressive (AR) modeling continues to play a very important role in model-based spectral estimation
[1-4]. A major reason for the wide appeal of AR modeling is its computational simplicity.
Specifically, the standard AR methods, such as the Covariance method, the Autocorrelation method, or their
variations, only need to solve a set of linear equations. Furthermore, in estimating ARMA or MA models,
AR-parameter estimation is a necessary intermediate step [1]. But there remains a fundamental problem with most
AR-modeling methods: the sensitivity of the AR spectral estimators to observation noise. Noisy
observation samples are indeed very common in practice, and the performance of the existing estimators
deteriorates drastically in such cases. There have been some previous attempts to address this problem. An AR
model in noise is a special type of ARMA model, and this property has been used in [8, 9], but this makes the
estimation problem highly nonlinear. Another suggested solution has been to model the process as a large-order
AR model so as to reduce the estimation bias, but this may lead to spurious peaks if the chosen model order is
too high [11]. Other methods suggest noise compensation to remove the bias, but this requires prior information
about the observation noise [10].

The main goal of this Section is to utilize certain data-prefiltering ideas which have been found to be highly
effective in estimating sinusoidal frequencies from noisy data [5, 6] and also for identifying deterministic systems
from Input-Output data [12] and Impulse Response data [13]. It is well known that a sinusoidal process can be
viewed as a limiting case of a narrowband AR-process. Indeed, the peak locations of AR-spectra are commonly
used as the estimates of frequencies [1]. But the poor performance of AR-methods with noisy data also causes
inferior frequency estimates at low SNR. In order to alleviate this problem, a large class of methods based on
principal-component (PC) analysis has been developed for reducing the effect of noise in data [3, 7]. But the
PC-based methods, though highly effective for tone-frequency estimation, cannot be used for cleansing noisy
AR-data. This is because the data and correlation matrices are theoretically full-rank in this case even when
there is no observation noise at all. A new class of algorithms, referred to as KiSS or IQML, has been developed
recently for Maximum-Likelihood frequency estimation [5, 6]. The KiSS algorithm essentially prefilters the noisy
data by iteratively minimizing the projection of the observations onto the noise subspace formed with linear
predictor type polynomial coefficients. It is shown in this work that this matrix-prefiltering approach also has the
desired noise-reduction effect on pure-AR-in-noise data. Extensive simulation studies indicate that the proposed
prefiltering produces more accurate AR-spectra than the conventional AR-modeling approaches.

Formulation of Data-Adaptive Prefiltering


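The proposed approach may be best explained by outlining the initialization step and the noise-subspace
projection utilized by the KiSS-IQML algorithm [5, 6]. In that algorithm, the frequency estimation problem is
essentially transformed into an AR-type polynomial estimation problem. Specifically, let

$$B(z) \triangleq b_0 + b_1 z^{-1} + \cdots + b_p z^{-p} \qquad (1)$$

be a p-th degree z-polynomial with roots at $e^{j\omega_1}, \ldots, e^{j\omega_p}$, respectively. The coefficient vector,

$$\mathbf{b} \triangleq [b_0\ b_1\ \cdots\ b_p]^T, \qquad (2)$$

is estimated by minimization of the following error criterion:

$$\min_{\{b_i\}_{i=0}^{p}} \mathbf{b}^H \mathbf{X}^H (\mathbf{B}\mathbf{B}^H)^{-1} \mathbf{X} \mathbf{b}, \qquad (3a)$$

where

$$\mathbf{B} \triangleq
\begin{bmatrix}
b_p & \cdots & b_0 & & 0 \\
 & \ddots & & \ddots & \\
0 & & b_p & \cdots & b_0
\end{bmatrix} \qquad (3b)$$

and

$$\mathbf{X} \triangleq
\begin{bmatrix}
x(p) & x(p-1) & \cdots & x(0) \\
x(p+1) & x(p) & \cdots & x(1) \\
\vdots & \vdots & & \vdots \\
x(N-1) & x(N-2) & \cdots & x(N-p-1)
\end{bmatrix} \qquad (3c)$$

$$\triangleq (\, \mathbf{g} \mid \mathbf{G} \,). \qquad (3d)$$

The weighted-quadratic structure in (3a) is utilized for minimizing the criterion iteratively. At the (i+1)-th
iteration, the weight matrix $(\mathbf{B}\mathbf{B}^H)$ is formed with the estimate of b found at the i-th iteration and the following
criterion is minimized to obtain the updated estimate:

$$\min_{\mathbf{b}} \mathbf{b}^H \big[ \mathbf{X}^H (\mathbf{B}^{(i)} \mathbf{B}^{(i)H})^{-1} \mathbf{X} \big] \mathbf{b}. \qquad (4)$$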

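The iterative algorithm in (4) is initialized with $\mathbf{b} = [1\ 0\ \cdots\ 0]^T$. Hence, the initial estimator has the following
form:

$$\min_{\mathbf{b}} \mathbf{b}^H \mathbf{X}^H \mathbf{X} \mathbf{b} \qquad (5a)$$

$$= \min_{\mathbf{b}} \| \mathbf{X} \mathbf{b} \|^2. \qquad (5b)$$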

Interestingly,  this  criterion  is  exactly  identical  to  the  ‘Covariance  Method’  of  linear  prediction  used  in  AR-modeling 
[4].  If  the  data  contains  no  observation  noise,  the  minimization  in  (5)  would  indeed  produce  exact  frequency 
estimates.  Furthermore,  the  performance  of  covariance  method  for  modeling  pure  AR-processes  without  any 
observation  noise  is  also  known  to  be  quite  good  [4].  But  the  performance  deteriorates  drastically  with  noisy 
observation  data.  In  fact,  simulations  indicate  that  even  at  reasonably  high  SNR  of  30-35  dB,  Covariance  (or 
Autocorrelation)  method  may  not  be  able  to  distinguish  closely  spaced  peaks  or  frequencies  in  the  underlying 
process. 
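The initial estimator in (5) amounts to an ordinary least-squares problem once a normalization is fixed. The sketch below assumes the normalization b_0 = 1, which matches the form of the update in (8) but is not stated alongside (5). The function name is hypothetical.

```python
import numpy as np

def covariance_method(x, p):
    """Least-squares solution of min ||X b||^2 subject to b[0] = 1 (assumed)."""
    x = np.asarray(x, dtype=complex)
    # Rows [x(n), x(n-1), ..., x(n-p)] for n = p, ..., N-1
    X = np.array([x[n - p:n + 1][::-1] for n in range(p, len(x))])
    g, G = X[:, 0], X[:, 1:]                  # partition X = (g | G)
    tail, *_ = np.linalg.lstsq(G, -g, rcond=None)
    return np.concatenate(([1.0], tail))

# Noise-free check with two tones at illustrative frequencies 0.20 and 0.23.
n = np.arange(25)
x = np.exp(2j * np.pi * 0.20 * n) + np.exp(2j * np.pi * 0.23 * n)
b = covariance_method(x, p=2)
freqs = np.sort(np.angle(np.roots(b)) / (2 * np.pi))
```

With noise-free data the roots of the estimated polynomial sit at the true frequencies, in line with the statement above. Adding observation noise to `x` is a quick way to see the degradation the text describes.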
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In  order  to  improve  on  the  initial  estimate  obtained  using  (5),  the  criterion  in  (4)  is  iteratively  minimized  in 
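KiSS/IQML. But the original criterion in (3a) also has the following equivalent forms:

$$\mathbf{b}^H \mathbf{X}^H (\mathbf{B}\mathbf{B}^H)^{-1} \mathbf{X} \mathbf{b} = \mathbf{b}^H \mathbf{X}^H (\mathbf{B}\mathbf{B}^H)^{-1} \mathbf{B}\mathbf{B}^H (\mathbf{B}\mathbf{B}^H)^{-1} \mathbf{X} \mathbf{b} \qquad (6a)$$

$$= \mathbf{x}^H \mathbf{B}^H (\mathbf{B}\mathbf{B}^H)^{-1} \mathbf{B}\mathbf{B}^H (\mathbf{B}\mathbf{B}^H)^{-1} \mathbf{B} \mathbf{x} \qquad (6b)$$

$$= \mathbf{x}^H \mathbf{P}_{B^H}^H \mathbf{P}_{B^H} \mathbf{x} \qquad (6c)$$

$$= \| \mathbf{P}_{B^H} \mathbf{x} \|^2 \qquad (6d)$$

$$= \| \mathbf{B}^H (\mathbf{B}\mathbf{B}^H)^{-1} \mathbf{B} \mathbf{x} \|^2 \qquad (6e)$$

$$= \| \mathbf{W} \mathbf{B} \mathbf{x} \|^2 \qquad (6f)$$

$$= \| \mathbf{W} \mathbf{X} \mathbf{b} \|^2, \qquad (6g)$$

where $\mathbf{P}_{B^H} \triangleq \mathbf{B}^H (\mathbf{B}\mathbf{B}^H)^{-1} \mathbf{B}$ denotes the 'projection matrix' of $\mathbf{B}^H$,

$$\mathbf{W} \triangleq \mathbf{B}^H (\mathbf{B}\mathbf{B}^H)^{-1} \qquad (7a)$$

is a weighting matrix and

$$\mathbf{x} \triangleq [x(0)\ x(1)\ \cdots\ x(N-1)]^T \qquad (7b)$$

is the observation vector.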

Equation (6d) shows that, in order to reduce the effect of noise in these AR-type parameter estimates, the
projection of the data (x) onto the column-space of the matrix $\mathbf{B}^H$ needs to be minimized. The criterion
in equation (6g) is similar to the criterion in (5b) for the Covariance method, except that in (6g) the projection
operation essentially prefilters the data matrix X by the weight matrix $\mathbf{W}$, which is formed from the coefficients
estimated at the previous iteration step. The most obvious conclusion from this discussion is that the
noise-suppression capability of KiSS-IQML is essentially due to this prefiltering of the data matrix (X) which appears
in the conventional Covariance method for AR modeling.
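The equivalence of the criterion forms can be checked numerically. The following sketch uses randomly generated test data and a hypothetical function name. It builds the banded coefficient matrix and the data matrix so that multiplying the former by the data vector equals multiplying the latter by the coefficient vector, then evaluates the weighted-quadratic form, the projection-norm form, and the prefiltered form.

```python
import numpy as np

def criterion_forms(x, b):
    """Return the weighted-quadratic, projection-norm and prefiltered values."""
    x = np.asarray(x, dtype=complex)
    b = np.asarray(b, dtype=complex)
    N, p = len(x), len(b) - 1
    # Data matrix: rows [x(n), ..., x(n-p)], n = p, ..., N-1
    X = np.array([x[n - p:n + 1][::-1] for n in range(p, N)])
    # Coefficient matrix: rows [.. b_p ... b_0 ..], so that B @ x == X @ b
    B = np.zeros((N - p, N), dtype=complex)
    for r in range(N - p):
        B[r, r:r + p + 1] = b[::-1]
    R = B @ B.conj().T
    W = B.conj().T @ np.linalg.inv(R)           # weighting (prefiltering) matrix
    quad = np.real(np.vdot(X @ b, np.linalg.solve(R, X @ b)))
    proj = np.linalg.norm(W @ B @ x) ** 2       # ||P x||^2 with P = W B
    pref = np.linalg.norm(W @ X @ b) ** 2       # ||W X b||^2
    return quad, proj, pref

rng = np.random.default_rng(0)
x = rng.standard_normal(12) + 1j * rng.standard_normal(12)
b = rng.standard_normal(4) + 1j * rng.standard_normal(4)
quad, proj, pref = criterion_forms(x, b)
```

The three returned values agree to numerical precision, which is the content of the chain of equalities: the weighted covariance-type criterion is the squared norm of the data after projection, or equivalently of the prefiltered equation error.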

As mentioned before, multiple sinusoids can be modeled as a limiting case of a narrowband AR-process [1]. The
analogies noted above lead to the hypothesis that a similar prefiltering operation may be equally
effective in reducing noise effects on the AR parameter estimates, especially for narrowband AR-processes.
The algorithm outlined next essentially minimizes the projection of the data onto the column-space of $\mathbf{B}^H$ in
order to obtain improved estimates of the AR-parameter vector b.

Steps for the Proposed Prefiltering Algorithm

1 : Obtain the initial estimate of the AR-parameters in b using any of the conventional AR-modeling methods.

2 : Form the $\mathbf{B}^{(i)}$ matrix defined in (3b) using the estimate of b found in the previous iteration.

3 : Minimize the criterion (4) to obtain an updated estimate of b, which has the following form:

$$\mathbf{b}^{(i+1)} = \begin{pmatrix} 1 \\ -\big( \mathbf{G}^H (\mathbf{B}^{(i)} \mathbf{B}^{(i)H})^{-1} \mathbf{G} \big)^{-1} \mathbf{G}^H (\mathbf{B}^{(i)} \mathbf{B}^{(i)H})^{-1} \mathbf{g} \end{pmatrix}, \qquad (8)$$

where $\mathbf{B}^{(i)}$ denotes the matrix obtained in Step-2, whereas g and G are defined in (3d).

4 : Go to Step-2 unless $\| \mathbf{b}^{(i+1)} - \mathbf{b}^{(i)} \| < \delta$, where $\delta$ is a small number.
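The four steps above can be sketched compactly as follows. This is an illustration under stated assumptions rather than the authors' code. The covariance method is used for Step 1, and the iteration cap `n_iter`, the tolerance `tol` and the function name are hypothetical choices.

```python
import numpy as np

def prefiltered_ar(x, p, n_iter=20, tol=1e-8):
    """Iteratively reweighted estimate of b = [1, b_1, ..., b_p]."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    X = np.array([x[n - p:n + 1][::-1] for n in range(p, N)])
    g, G = X[:, 0], X[:, 1:]                        # X = (g | G)
    # Step 1: initial estimate (covariance method, with b_0 = 1)
    b = np.concatenate(([1.0], np.linalg.lstsq(G, -g, rcond=None)[0]))
    for _ in range(n_iter):
        # Step 2: banded matrix B from the current estimate
        B = np.zeros((N - p, N), dtype=complex)
        for r in range(N - p):
            B[r, r:r + p + 1] = b[::-1]
        R = B @ B.conj().T
        # Step 3: weighted least-squares update of the last p coefficients
        RinvG = np.linalg.solve(R, G)
        Rinvg = np.linalg.solve(R, g)
        tail = -np.linalg.solve(G.conj().T @ RinvG, G.conj().T @ Rinvg)
        b_new = np.concatenate(([1.0], tail))
        # Step 4: stop once the update is small
        converged = np.linalg.norm(b_new - b) < tol
        b = b_new
        if converged:
            break
    return b
```

The linear systems are solved directly instead of forming explicit inverses, which is a numerical convenience and does not change the estimator. No symmetry constraint is imposed on the coefficients, consistent with the discussion that follows.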

An important difference between the KiSS-IQML algorithm and the proposed method is that, in the case of
KiSS-IQML, conjugate-symmetry constraints need to be imposed on the coefficients of the B(z)-polynomial in an
attempt to constrain the roots to lie on the unit circle. This makes the optimization problem even more nonlinear.
But in the present case no such constraints are necessary and hence the optimization in (4) is more
straightforward. Extensive simulation studies have shown that this algorithm does produce a better AR spectrum match
at lower SNR than any of the standard AR-modeling techniques.

Simulation  Results 

The test data-set given in Chapter-7 in [3] was generated for the simulation. Fig. 1 illustrates the
performance of the Covariance method for 50 independent realizations of the observation data at 30 dB SNR. The
solid line in Fig. 2 shows the average of the estimated spectra of the 50 realizations and the dashed line shows
the true spectrum. Figures 3-4 and 5-6 show the corresponding results with the Modified Covariance Method and
the Autocorrelation Method, respectively, with identical data sets. The results clearly demonstrate that even at this
moderately high SNR, none of these commonly used methods was able to distinguish the two spectral peaks
for most of the noise realizations. Fig. 7 shows the results of the proposed prefiltering algorithm for those 50
identical realizations at the same SNR. The algorithm converged in 6-8 iterations in all cases. Fig. 8 shows the
average of the 50 realizations with the true spectrum. This improvement was found to be consistent even at lower
SNR values. Similar improvements have also been observed when the Autocorrelation method and the Modified
Covariance method were used to generate the initial AR parameter estimates. The plots clearly demonstrate
that the proposed method was able to match the AR-spectra more closely. With simulated data, the average
prediction error power for the proposed estimator was also found to be much smaller than that of the standard methods.

Section - 2.4 : IMPROVED ARMA-PARAMETER ESTIMATION FROM NOISY OBSERVATION DATA


Summary 

Existing methods for ARMA modeling assume that the available process is produced by an ARMA system
driven by a white input process, i.e., the observed process is considered to be pure ARMA. In practice, the
available data usually have observation noise added to them, but the ARMA methods do not address this problem.
Simulations show that the performance of the existing ARMA methods deteriorates when the observation process is
noisy. In this Section a new ARMA algorithm is given which utilizes a recently developed deterministic rational
system identification method (OM-IO) [8] that minimizes the modeling or output error norm. The algorithm first
estimates the input process and then invokes OM-IO using the input-output data. Simulations indicate that the
proposed method is quite effective even with low-SNR observation data.


Introduction 

With both poles and zeroes, ARMA models are best capable of effectively representing general spectra with
possibly sharp peaks as well as deep valleys. Modeling of ARMA processes involves solution of a set of highly
nonlinear equations. Existing methods divide the problem into several 'equation error' minimization problems
to estimate the AR and MA parameters in several stages. The estimation problem is further complicated if the
available data is also corrupted with observation noise. In fact, simulation studies indicate that the performance of
the existing ARMA modeling methods deteriorates significantly with noisy data. This drawback may be attributed
to the sensitivity of equation-error minimization based methods to the presence of noise. In this Section, we
propose to address this problem by incorporating a recently developed optimal algorithm for identification of
deterministic ARMA systems [8] into the stochastic ARMA modeling problem. Recent results indicate that
estimators based on minimizing model 'fitting-error' have superior performance when compared to those which
rely on equation-error minimization [8, 13]. In view of this, unlike existing ARMA methods, the algorithm
presented in this work minimizes output or modeling errors. The results obtained so far indicate that the proposed
approach is much more effective than existing methods for ARMA parameter estimation when the available data
is not purely ARMA but has some observation noise added to it.

The  ARMA  MODEL 


An ARMA(p, q) process can be represented in a linear difference equation form as

$$x(n) = -\sum_{k=1}^{p} b_k x(n-k) + \sum_{k=0}^{q} a_k u(n-k), \qquad (1)$$

where the corresponding z-domain transfer function has the following form:

$$H(z) = \frac{a_0 + a_1 z^{-1} + \cdots + a_q z^{-q}}{1 + b_1 z^{-1} + b_2 z^{-2} + \cdots + b_p z^{-p}} = \frac{A(z)}{B(z)}. \qquad (2)$$

Let

$$\mathbf{a} \triangleq [a_0\ a_1\ \cdots\ a_q]^T \quad \text{and} \qquad (3a)$$

$$\mathbf{b} \triangleq [1\ b_1\ \cdots\ b_p]^T \qquad (3b)$$

denote the unknown MA and AR parameters, respectively. In vector form,

$$\mathbf{x} \triangleq [x(0)\ x(1)\ \cdots\ x(N-1)]^T \quad \text{and} \qquad (4a)$$

$$\mathbf{u} \triangleq [u(0)\ u(1)\ \cdots\ u(N-1)]^T \qquad (4b)$$

denote the output data and the driving noise sequences, respectively.
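The difference equation can be turned into a short simulation routine. The sketch below assumes zero initial conditions and the sign convention in which the denominator is B(z) = 1 + b_1 z^-1 + ... + b_p z^-p, so the AR terms enter with a minus sign. The function name is hypothetical.

```python
import numpy as np

def arma_filter(a, b, u):
    """x(n) = -sum_{k=1..p} b[k] x(n-k) + sum_{k=0..q} a[k] u(n-k), with b[0] = 1."""
    u = np.asarray(u, dtype=float)
    x = np.zeros_like(u)
    for n in range(len(u)):
        ma = sum(a[k] * u[n - k] for k in range(len(a)) if n - k >= 0)
        ar = sum(b[k] * x[n - k] for k in range(1, len(b)) if n - k >= 0)
        x[n] = ma - ar
    return x

# Impulse response of an illustrative ARMA(1, 1) system: A(z) = 1 + 0.5 z^-1,
# B(z) = 1 - 0.9 z^-1.
h = arma_filter([1.0, 0.5], [1.0, -0.9], [1.0, 0.0, 0.0, 0.0])
```

Driving the same routine with a white noise sequence in place of the impulse produces one realization of the ARMA process. Adding a second, independent noise sequence to that output gives the kind of noisy observations this Section is concerned with.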

Previous  Methods 

Estimation of the Maximum-Likelihood (ML) parameters of an ARMA process is a highly nonlinear problem.
Akaike's MLE method requires nonlinear optimization, which is prone to poor convergence if the initial estimates
are not chosen properly [4]. To overcome the complexities of MLE, many computationally attractive techniques have
also been developed. Among these, the Modified Yule-Walker equations (MYWE) method estimates the AR
parameters from the tail-end of the Yule-Walker (Y-W) equations, i.e., from $r_{xx}(k)$, $k = q+1, \ldots, q+p$. The output
process is then filtered by the estimated AR polynomial B(z), which results in an MA process from which the MA
parameters can be determined by any standard procedure for MA parameter estimation [9]. An extension of this
approach, known as the Least-Squares MYWE (LSMYWE) [10], uses more of the tail-end of the Y-W equations and yields
better results than MYWE.
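The AR step of MYWE reduces to a p-by-p linear system in the autocorrelation lags beyond q. A minimal sketch is given below. It assumes the convention B(z) = 1 + b_1 z^-1 + ... + b_p z^-p, for which r_xx(k) + sum_l b_l r_xx(k-l) = 0 when k > q, and it takes the autocorrelation values as given. The function name and the test values are illustrative.

```python
import numpy as np

def mywe_ar(r, p, q):
    """AR coefficients [1, b_1, ..., b_p] from r_xx(k), k = 0, ..., q + p.

    Uses the Yule-Walker equations for lags k = q+1, ..., q+p only.
    """
    r = np.asarray(r, dtype=complex)

    def lag(k):
        # r_xx(-k) = conj(r_xx(k))
        return r[k] if k >= 0 else np.conj(r[-k])

    R = np.array([[lag(q + 1 + i - l) for l in range(1, p + 1)] for i in range(p)])
    rhs = -np.array([lag(q + 1 + i) for i in range(p)])
    return np.concatenate(([1.0], np.linalg.solve(R, rhs)))

# Illustrative ARMA(1, 1) lags obeying r(2) = 0.8 * r(1), i.e. b_1 = -0.8.
b = mywe_ar([2.0, 1.5, 1.2], p=1, q=1)
```

In practice the lags would be sample estimates from the data. The least-squares variant mentioned above would stack more than p such equations and solve them with `numpy.linalg.lstsq` instead of an exact solve.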

In stochastic ARMA modeling, the driving white noise sequence is completely unknown. Clearly, if it were
somehow possible to obtain an estimate of the driving noise $u(n)$, then any input-output system identification
technique could be used to estimate the ARMA parameters. Two well-known ARMA methods are indeed based
on this principle, namely, Two-Stage Least-Squares [5] and Three-Stage Least-Squares [7, 12]. These methods
first model the output data as a large-order AR process; a prediction error sequence is then obtained
by passing the data through the corresponding inverse filter, which is MA. This whitened prediction error sequence
is used as the estimate of the input white noise sequence $u(n)$. With this estimated input and the observed
output, the ARMA parameters are then found by minimizing equation errors in two [5] or three stages [7]. The
three-stage approach has been shown to have lower variance than the two-stage case. But, as will be shown with
simulations below, even the three-stage algorithm cannot perform well when the observation data is noisy, which
is quite possible in practical situations.
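The whitening step shared by these methods can be sketched as follows. This is only an illustration under stated assumptions: the high-order AR fit uses the autocorrelation method with a Levinson-Durbin recursion (chosen here for brevity; it is not necessarily the fitting method of the cited works), the data are noise-free, and all function names and parameter values are hypothetical.

```python
import random


def simulate_arma11(a1, b1, n, seed=0):
    """Simulate x(n) = -a1*x(n-1) + w(n) + b1*w(n-1) with unit-variance white Gaussian w(n)."""
    rng = random.Random(seed)
    x, x_prev, w_prev = [], 0.0, 0.0
    for _ in range(n):
        w = rng.gauss(0.0, 1.0)
        x_now = -a1 * x_prev + w + b1 * w_prev
        x.append(x_now)
        x_prev, w_prev = x_now, w
    return x


def autocorr(x, lag):
    """Biased sample autocorrelation at the given lag."""
    n = len(x)
    return sum(x[i] * x[i + lag] for i in range(n - lag)) / n


def levinson_durbin(r, order):
    """Return AR coefficients [1, a_1, ..., a_order] and the final prediction error power."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / err                      # reflection coefficient
        a_new = a[:]
        for k in range(1, m):
            a_new[k] = a[k] + k_m * a[m - k]
        a_new[m] = k_m
        a = a_new
        err *= 1.0 - k_m * k_m
    return a, err


def whiten(y, order):
    """Fit a large-order AR model and pass y through the resulting MA (inverse) filter."""
    r = [autocorr(y, k) for k in range(order + 1)]
    coeffs, _ = levinson_durbin(r, order)
    u = [sum(coeffs[k] * y[n - k] for k in range(order + 1))
         for n in range(order, len(y))]
    return coeffs, u


y = simulate_arma11(a1=-0.8, b1=0.5, n=20000, seed=2)
coeffs, u = whiten(y, order=10)
rho1 = autocorr(u, 1) / autocorr(u, 0)   # should be near zero if u is close to white
```

The residual u(n) then plays the role of the estimated input in the subsequent least-squares stages.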

It may be mentioned here that in [11] a data-adaptive prefiltering method has also been proposed for improved
modeling of AR parameters from noisy observation data. As noted in [11], there has been some previous work on
AR modeling from noisy data, but the authors are not aware of any such work for modeling ARMA parameters
from noisy data, which is the problem considered in this work.

The  Proposed  Idea 

Instead of minimizing the equation error criterion as in [5, 7], the proposed algorithm minimizes the modeling
error or output error criterion. This is also a nonlinear problem, but a recently developed input-output identification
method optimally decouples the numerator and denominator problems [8] into two separate problems of smaller
dimensions. The decoupled estimators retain the global optimum of the original criterion. It has been further
shown in [8] that in the decoupled form, estimation of the numerator $\mathbf{a}$ is a purely linear problem whereas the
estimation of the denominator is a nonlinear problem of reduced dimensionality. But the nonlinear criterion for
the denominator possesses a convenient weighted-quadratic structure which can be easily exploited to estimate the
denominator iteratively. Preliminary simulation studies show that the proposed method outperforms the existing
ARMA modeling approaches when the observed data is corrupted with noise. A brief explanation of the underlying
theory along with the algorithm steps is in order. Some simulation results included at the end illustrate the
improved performance of the proposed method.

FORMULATION  OF  THE  ESTIMATION  PROCEDURE 

Let

$$y(n) = x(n) + v(n) \tag{5}$$

be the observed noisy ARMA process, where $v(n)$ denotes the observation noise process. Let

$$\mathbf{y} \triangleq [\, y(0) \;\; y(1) \;\; \cdots \;\; y(N-1) \,]^T \tag{6}$$

denote the noisy output vector. Using the covariance method, this observation data is first modeled as a large-order
($= L$) AR model to obtain an AR polynomial $B_L(z)$ such that $L \gg p$, the true AR order of the underlying
ARMA process. The observation sequence $y(n)$ is then filtered through an MA filter in the form of $B_L(z)$ to
obtain a whitened prediction error sequence $u(n)$. Then, using $u(n)$ as the estimate of the input sequence, the
ARMA modeling problem can be restated as the following output-error minimization problem:

$$\min_{\mathbf{a},\,\mathbf{b}} \; \sum_{n=0}^{N-1} \left| \, y(n) - \frac{A(z)}{B(z)}\, u(n) \right|^2 \tag{7}$$

where $A(z)$ and $B(z)$ denote the numerator and denominator polynomials with coefficient vectors $\mathbf{a}$ and $\mathbf{b}$, respectively.

It has been shown in [8] that the above nonlinear problem can be decoupled into a purely linear problem to
estimate $\mathbf{a}$ and a nonlinear problem for $\mathbf{b}$. Such decoupling techniques have also been found to be very effective
in Maximum Likelihood estimation of the parameters of multiple exponential models [2, 3]. It may be noted here
that the estimators in [6, 7] utilize the estimated prediction error sequence $u(n)$. But those estimators do not
minimize the true model-fitting error defined in (7); they are based on minimizing equation error norms. The following
definitions are necessary to formulate the decoupling of the numerator and denominator optimization problems.


Let $H_I(z)$ be an inverse filter corresponding to $B(z)$, i.e.,

$$B(z)\,H_I(z) = 1. \tag{8}$$

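By writing this convolution in matrix form, it can be shown that [8]

$$
\mathbf{B}^T \mathbf{H}_I \;\triangleq\;
\begin{bmatrix}
\cdots & b_1 & b_0 &        & 0 \\
       & \ddots & \ddots & \ddots &   \\
0      &        & \cdots & b_1 & b_0
\end{bmatrix}
\begin{bmatrix}
h_I(0)   &        & 0 \\
\vdots   & \ddots &   \\
\vdots   &        & h_I(0) \\
\vdots   &        & \vdots \\
h_I(N-1) & \cdots & h_I(N-q-1)
\end{bmatrix}
= \mathbf{0}, \tag{9}
$$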

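which leads to the conclusion that $\mathbf{B}$ is orthogonal to the matrix $\mathbf{H}_I$. Utilizing this orthogonality relationship,
the optimal criterion for estimating the denominator can be shown to be [8]:

$$
\min_{\mathbf{b}} \; \mathbf{y}^T \mathbf{U}_I^T \mathbf{B} \left( \mathbf{B}^T \mathbf{U}_I \mathbf{U}_I^T \mathbf{B} \right)^{-1} \mathbf{B}^T \mathbf{U}_I \mathbf{y}
\;=\; \min_{\mathbf{b}} \; \mathbf{b}^T \mathbf{Z}^T \left( \mathbf{B}^T \mathbf{U}_I \mathbf{U}_I^T \mathbf{B} \right)^{-1} \mathbf{Z} \mathbf{b} \tag{10}
$$

where $\mathbf{U}$ is a lower-triangular convolution matrix formed with the estimated input sequence $u(n)$,

$$
\mathbf{U} \triangleq
\begin{bmatrix}
u(0)   &        &        & 0 \\
u(1)   & u(0)   &        &   \\
\vdots &        & \ddots &   \\
u(N-1) & u(N-2) & \cdots & u(0)
\end{bmatrix}
\in \mathbb{R}^{N \times N}.
$$

The matrix $\mathbf{U}_I$ is the inverse of $\mathbf{U}$ and is also lower triangular. Furthermore,

$$\mathbf{z} \triangleq \mathbf{U}_I \mathbf{y} \quad \text{and} \quad \mathbf{Z}\mathbf{b} \triangleq \mathbf{B}^T \mathbf{z},$$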


where the matrix $\mathbf{Z}$ has a Toeplitz structure and $\mathbf{g}$ is a column vector formed with the leading column of $\mathbf{Z}$.
The denominator vector $\mathbf{b}$ is estimated by minimizing the criterion in (10). The minimization is performed
iteratively by forming the weight matrix $(\mathbf{B}^T \mathbf{U}_I \mathbf{U}_I^T \mathbf{B})$ with the estimate of $\mathbf{b}$ obtained at the previous
iteration step. At convergence of the iterations, the estimated $\mathbf{b}$ is used to form the matrix $\mathbf{H}_I$, using its
inverse sequence as in (8). Then the least-squares solution of the numerator $\mathbf{a}$ is found as

$$\mathbf{a} \triangleq (\mathbf{U}\mathbf{H}_I)^{\#}\, \mathbf{y} \tag{12}$$

where $\#$ denotes the matrix pseudo-inverse. The iterative process is initialized by estimates obtained by minimizing
equation errors as in [6, 7, 10]. Hence, further iterations of the proposed method are expected to improve upon the
equation-error based estimates because they minimize the true modeling error criterion defined in (7).
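The step from the output-error criterion (7) to the decoupled criterion (10) can be sketched as a short derivation. It is offered only as a reading aid and rests on an assumption not stated explicitly above: that the columns of $\mathbf{U}_I^T\mathbf{B}$ and of $\mathbf{U}\mathbf{H}_I$ have full rank and together span $\mathbb{R}^N$. The projector symbol $\mathbf{P}$ is introduced here for illustration.

```latex
% For a fixed denominator b, the model output U H_I a is linear in a,
% so the inner minimization over a is an ordinary least-squares problem.
\begin{align*}
\hat{\mathbf{a}}(\mathbf{b}) &= (\mathbf{U}\mathbf{H}_I)^{\#}\,\mathbf{y},
&
\min_{\mathbf{a}} \,\bigl\| \mathbf{y} - \mathbf{U}\mathbf{H}_I \mathbf{a} \bigr\|^2
 &= \mathbf{y}^T (\mathbf{I} - \mathbf{P})\,\mathbf{y},
\qquad \mathbf{P} \triangleq \mathbf{U}\mathbf{H}_I (\mathbf{U}\mathbf{H}_I)^{\#}. \\[4pt]
% The orthogonality relation B^T H_I = 0 identifies the complementary subspace:
(\mathbf{U}_I^T \mathbf{B})^T (\mathbf{U}\mathbf{H}_I)
 &= \mathbf{B}^T \mathbf{U}_I \mathbf{U} \mathbf{H}_I
  = \mathbf{B}^T \mathbf{H}_I = \mathbf{0}, \\[4pt]
% so that (under the rank assumption) I - P is the projector onto range(U_I^T B):
\mathbf{I} - \mathbf{P}
 &= \mathbf{U}_I^T \mathbf{B}
    \bigl( \mathbf{B}^T \mathbf{U}_I \mathbf{U}_I^T \mathbf{B} \bigr)^{-1}
    \mathbf{B}^T \mathbf{U}_I .
\end{align*}
```

Substituting the last line into $\mathbf{y}^T(\mathbf{I}-\mathbf{P})\mathbf{y}$ gives the left-hand side of (10), which depends on $\mathbf{b}$ alone.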

The  Overall  algorithm  in  Brief 

The  complete  algorithm  for  the  ARMA  parameter  identification  can  be  summarized  as  the  following  four 
primary  steps  : 

1.  Model  the  observed  sequence  y  by  a  large  order  AR  model. 

2.  Determine  the  prediction  error  white  noise  sequence  u,  which  is  treated  as  the  input  sequence  for  the  system. 

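3.  Knowing $\mathbf{u}$ and $\mathbf{y}$, start the iterative procedure to minimize the error criterion in (10). At each iteration,
$\mathbf{b}$ is estimated either as the eigenvector corresponding to the minimum eigenvalue of $\mathbf{Z}^T (\mathbf{B}^T \mathbf{U}_I \mathbf{U}_I^T \mathbf{B})^{-1} \mathbf{Z}$
or by setting $b_0 = 1$. In the latter case, the estimate of $\mathbf{b}$ at the $(i+1)$-th iteration is obtained using the
estimates of the previous iteration step as follows,

$$
\mathbf{b}^{(i+1)} =
\begin{bmatrix}
1 \\
-\left( \mathbf{G}^T \mathbf{W}^{(i)} \mathbf{G} \right)^{-1} \mathbf{G}^T \mathbf{W}^{(i)} \mathbf{g}
\end{bmatrix}, \tag{13a}
$$

where $\mathbf{G}$ collects the remaining columns of $\mathbf{Z}$, i.e., $\mathbf{Z} = [\,\mathbf{g} \;\; \mathbf{G}\,]$, and the matrix $\mathbf{W}^{(i)}$ is formed with the
estimates of $\mathbf{b}$ at the previous iteration step as

$$\mathbf{W}^{(i)} \triangleq \left( \mathbf{B}^T \mathbf{U}_I \mathbf{U}_I^T \mathbf{B} \right)^{-1} \Big|_{\mathbf{b} = \mathbf{b}^{(i)}}. \tag{13b}$$

The iterations are continued until convergence is reached, i.e., until no significant change is found in $\mathbf{b}$ between
successive iterations.

4.  Estimate $\mathbf{a}$ using (12).

SIMULATION RESULTS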


The simulations were performed with the test data set (ARMA4) used in Chapter 10 of [1]. The true system
PSD has two prominent peaks as shown in Fig. 1. The PSD estimates at an SNR of 20 dB in the observation
data $y(n)$ are shown in Fig. 2. The results using MYWE, LSMYWE, Mayne-Firoozan and the proposed method
are shown in Figures 2a through 2d. The proposed method performs better than the other three. The
corresponding results with 15 dB SNR are shown in Fig. 3a through 3d. The performance of the proposed method
is maintained even at this level of SNR, though the results with the three existing methods have deteriorated.
Further simulations at lower SNR levels indicate that the peaky spectral shape is maintained at least down to 12 dB.
These results indicate that the algorithm remains effective when the observations are noisy.

Section - 2.5 : TIME-DOMAIN DETECTION OF ELECTRONIC WARFARE SIGNALS IN NOISE
Summary 

In the passive mode of operation of EW applications, source signals may not be present within a given
observation window, or the signals may fill only a part of the estimation window. In either case, any frequency
estimation algorithm may produce erroneous or noise frequencies. Considering the relatively high computational
burden of any frequency estimation method, it is desirable to invoke a frequency estimation method only when
a detection scheme indicates a high probability of presence of a threat. In this Section, the theory of detection of
sinusoids from quantized and noisy time-domain observation samples has been developed. The theoretical work
on single/multiple samples is mostly complete. Studies with quantized data have also been performed and the
results appear reasonably good. Lab tests for the Envelope Detection and Square-Law cases have been conducted
at Wright Labs with satisfactory results.

I.  Introduction  : 

In Electronic Warfare (EW) environments, microwave receivers play a major role in passive identification and
localization of unknown targets emitting high-frequency electromagnetic signals. EW signals cover a relatively
wide bandwidth, typically in the range of 0.2 to 15 GHz, and existing microwave receivers utilize mostly analog
signal processing tools and techniques [1-3]. In fact, there are no EW receivers that process microwave radar signals
entirely in the digital domain. With the emergence of increasingly faster and less expensive digital computers and
high-speed A/D converters, digital processing of microwave radar signals is expected to become
practically feasible.

The primary task of a microwave receiver is to gather data for sorting of signals and identification of the type
of the radar emitting the received signal. Based on this information, jamming, weapon delivery or other options
are considered. In order to perform these tasks, the receiver must analyze the received radar pulses and measure or
estimate the following six parameters: Angle-of-Arrival (AOA), Radio Frequency (RF), Time of Arrival (TOA),
Pulse Amplitude (PA), Pulse Width (PW) and Polarization (P). But in order to reduce the computational burden, the
estimation of these parameters should be undertaken only when it is determined that there is a high probability
of the presence of a threat signal.

In  this  part  of  the  project,  the  detection  problem  has  been  considered  in  the  time-domain  for  single  and 
multiple  samples.  Detection  thresholds  and  Probability  of  Detection  based  on  Neyman-Pearson  Criterion  have 
been  derived.  Derivations  are  given  for  calculating  the  Thresholds  and  Probability  of  Detection  for  both  the 
‘Square-Law’  and  ‘Envelope’  detectors. 

II.  Time-Domain Detection :

Almost  all  existing  AOA/RF  estimation  algorithms  assume  that  the  signal  is  already  present  in  the  observed 
data.  But  in  the  passive  mode  of  operations  of  EW  applications,  source  signals  may  not  be  present  at  all  within 
the  observation  window,  or  the  signals  may  fill  only  a  part  of  the  estimation  window.  In  either  case,  any  frequency 
estimation  algorithm  would  essentially  produce  erroneous  or  noise  frequencies  because  the  observed  signal  would 
not  satisfy  the  model  assumed  by  the  estimation  algorithm.  Considering  the  relatively  high  computational  burden, 
any  estimation  method  should  be  invoked  only  when  a  detection  scheme  indicates  high  probability  of  presence  of 
threat. 

Since EW receivers do not have any prior knowledge about the frequency/amplitude/phase of the received
signals, conventional matched filters cannot be used in this case. An obvious solution would be to perform the
detection in the frequency domain, i.e., the presence of targets can be determined by thresholding of FFT peaks.
The frequency-domain approaches are robust but have certain disadvantages in that a decision can be made only
after a block of data has been collected. Furthermore, a lot of computational power may be wasted if the FFT is taken
continuously, even when no target is present. Instead, we plan to incorporate a time-domain detection scheme
that can detect targets in real time using a single observation or a small number of samples. Once a preliminary
decision is taken, an FFT or a more sophisticated frequency/AOA estimation algorithm can be invoked, if desired.

II.1  :  Signal  and  Noise  Model 


Microwave radar signals can be modeled as

$$
\begin{aligned}
x(k) &= A\cos(\omega_c k + \theta) + n(k) \\
     &= A\cos(\omega_c k)\cos\theta - A\sin(\omega_c k)\sin\theta + n_I(k)\cos(\omega_c k) + n_Q(k)\sin(\omega_c k)
\end{aligned} \tag{1}
$$

where $n(k)$ denotes narrowband noise samples. To perform the time-domain detection, the received real data
is first converted into a complex analytic signal. This is achieved by passing the real signal through a Hilbert
Transformer to form the in-phase (I) and quadrature (Q) components of the complex analytic signal. When no
signal is present, the I and Q components may be represented as

$$X_I(k) = n_I(k) \tag{2a}$$

$$X_Q(k) = n_Q(k). \tag{2b}$$

On the other hand, in the presence of signal, the corresponding components are given as:

$$X_I(k) = A\cos\theta + n_I(k) \tag{3a}$$

$$X_Q(k) = A\sin\theta + n_Q(k). \tag{3b}$$


Since the amplitude, frequency and phase of the received signal are unknown, the detection criterion has
to rely on thresholding of the amplitude (PA) of the analytic signal. The frequency and phase can be ignored
for detecting only the presence of a target signal. The amplitude threshold cannot be based on minimizing the
total probability of error because the exact amplitude of the signal is unknown at the receiver. Furthermore, the
probability of False Alarm ($P_{FA}$) must also be kept very low (10“® or smaller). Hence, the best detection scheme
would be to calculate the threshold by setting $P_{FA}$ to a constant. The thresholds for the Square-Law detector
are derived next for single and multiple samples within a pulse.

II.2  :  Square  Law  Detector 

The noise is assumed to be narrowband and Gaussian distributed with zero mean and variance $\sigma^2$.
Hence, for the no-signal case of (2), the I/Q noise samples are distributed as:

$$X_I(k) \sim \mathcal{N}(0, \sigma^2) \tag{4a}$$

$$X_Q(k) \sim \mathcal{N}(0, \sigma^2). \tag{4b}$$

In the following derivation, the time variable $k$ will be suppressed until the multiple-samples case is considered.

II.2.a  :  Single Sample Case

Assuming independent noise samples, when no signal is present, the joint probability density function (PDF)
of the I/Q channel outputs is given by:

$$f(X_I, X_Q) = \frac{1}{2\pi\sigma^2} \exp\!\left( -\frac{X_I^2 + X_Q^2}{2\sigma^2} \right). \tag{5}$$

Consider the transformation to polar form:

$$X_I = R\cos\alpha \tag{6a}$$

$$X_Q = R\sin\alpha. \tag{6b}$$

Using the Jacobian of this transformation, the joint PDF for this polar form can be shown to be:

$$f(r, \alpha) = \frac{r}{2\pi\sigma^2}\, e^{-r^2/(2\sigma^2)}\, u(r). \tag{7}$$

From this, the marginal for the Envelope (R) is given by

$$f_R(r) = \int_0^{2\pi} f(r, \alpha)\, d\alpha = \frac{r}{\sigma^2}\, e^{-r^2/(2\sigma^2)}\, u(r), \tag{8}$$

which is known as the Rayleigh PDF.

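II.2.a.1  :  The PDF and Characteristic Function with No Signal

A square-law detector forms the following quantity,

$$Z \triangleq X_I^2 + X_Q^2 = R^2 \tag{9}$$

which needs to be compared to a threshold to decide the presence/absence of a radar target. Since
$dZ/dR = 2R = 2\sqrt{Z}$, the PDF of the Square-Law output when no signal is present (denoted as $\bar{s}$) is given as:

$$f_Z(z|\bar{s}) = \frac{1}{2\sigma^2}\, e^{-z/(2\sigma^2)}\, u(z) \tag{10}$$

which is the Exponential PDF. The Characteristic Function (CF) is defined as the Fourier Transform of the
density function:

$$C_Z^{\bar{s}}(\omega) \triangleq \mathcal{F}\left[ f_Z(z|\bar{s}) \right] = \frac{1}{2\sigma^2} \int_0^{\infty} e^{-z/(2\sigma^2)}\, e^{-j\omega z}\, dz = \frac{1}{1 + j2\omega\sigma^2}. \tag{11}$$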

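II.2.a.2  :  The PDF and Characteristic Function in Presence of Signal

When a target is present, i.e., in the case of (3), the I/Q samples are distributed as:

$$X_I(k) \sim \mathcal{N}(A\cos\theta, \sigma^2) \tag{12a}$$

$$X_Q(k) \sim \mathcal{N}(A\sin\theta, \sigma^2). \tag{12b}$$

In this case, the joint probability density function (PDF) is given by:

$$f(X_I, X_Q | s) = \frac{1}{2\pi\sigma^2} \exp\!\left\{ -\frac{1}{2\sigma^2} \left[ (X_I - A\cos\theta)^2 + (X_Q - A\sin\theta)^2 \right] \right\}. \tag{13}$$

Once again, using the Jacobian of the transformation, the polar-form joint PDF can be shown to be

$$f(r, \alpha | s) = \frac{r}{2\pi\sigma^2} \exp\!\left\{ -\frac{1}{2\sigma^2} \left[ r^2 + A^2 - 2rA\cos(\alpha - \theta) \right] \right\} u(r). \tag{14}$$

Integrating over $\alpha$, the marginal PDF of the Envelope is given by

$$f_R(r|s) = \int_0^{2\pi} f(r, \alpha | s)\, d\alpha \tag{15a}$$

$$\phantom{f_R(r|s)} = \frac{r}{2\pi\sigma^2}\, e^{-(r^2 + A^2)/(2\sigma^2)} \int_0^{2\pi} e^{(rA/\sigma^2)\cos(\alpha - \theta)}\, d\alpha \tag{15b}$$

$$\phantom{f_R(r|s)} = \frac{r}{\sigma^2}\, e^{-(A^2 + r^2)/(2\sigma^2)}\, I_0\!\left( \frac{rA}{\sigma^2} \right) u(r) \tag{15c}$$

where $I_0(\cdot)$ denotes the modified Bessel function of the first kind of order zero. The PDF in (15c) is known as the Rician.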

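Similar to the no-signal case in (9)-(10), the PDF of the Square-Law output $Z$ with signal-plus-noise is given
as:

$$f_Z(z|s) = \frac{1}{2\sigma^2} \exp\!\left( -\frac{A^2 + z}{2\sigma^2} \right) I_0\!\left( \frac{A\sqrt{z}}{\sigma^2} \right) u(z). \tag{16}$$

In this case, the Characteristic Function can be found as follows:

$$C_Z^{s}(\omega) \triangleq \mathcal{F}\left[ f_Z(z|s) \right]. \tag{17}$$

The following Fourier Transform pair can be found in [Campbell and Foster [15], page 79, pair 655.1]:

$$e^{-\rho z}\, I_0\!\left( 2\sqrt{\beta z} \right) u(z) \;\longleftrightarrow\; \frac{1}{p + \rho}\, e^{\beta/(p + \rho)}, \qquad p = j\omega. \tag{18}$$

Using (18) and with appropriate substitutions, $C_Z^{s}(\omega)$ is given by:

$$C_Z^{s}(\omega) = \frac{1}{1 + j2\omega\sigma^2} \exp\!\left( -\frac{A^2}{2\sigma^2} + \frac{A^2}{2\sigma^2 (1 + j2\omega\sigma^2)} \right) = \frac{1}{1 + j2\omega\sigma^2} \exp\!\left( \frac{-j\omega A^2}{1 + j2\omega\sigma^2} \right). \tag{19}$$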

II.2.a.3  :  The Neyman-Pearson Criterion with a Single Sample :

For this one-dimensional case, the decision that the signal is present is taken if the likelihood ratio [17] satisfies

$$\ell = \frac{f_Z(z|s)}{f_Z(z|\bar{s})} > k(P_{FA}) \tag{20}$$

where $k$ is a constant that depends on the probability of False Alarm $P_{FA}$. From this relationship it may appear
that in order to find the decision threshold, one would need to know or estimate the signal. But one of the most
attractive consequences of the Neyman-Pearson criterion is that for a given predetermined $P_{FA}$, the threshold can be
set by integrating $f_Z(z|\bar{s})$ over the region where the signal is declared present [11, 17].

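II.2.a.4  :  Probability of False Alarm

If the threshold is denoted as $\gamma$, the false-alarm probability can be calculated as

$$P_{FA} = \int_{\gamma}^{\infty} f_Z(z|\bar{s})\, dz = \frac{1}{2\sigma^2} \int_{\gamma}^{\infty} e^{-z/(2\sigma^2)}\, dz \;\; \text{(from (10))} \;\; = e^{-\gamma/(2\sigma^2)}. \tag{21}$$

II.2.a.5  :  Detection Threshold

Taking the natural logarithm of both sides of (21), the detection threshold is given as

$$\gamma = -2\sigma^2 \ln P_{FA}. \tag{22}$$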
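The threshold rule can be checked with a small Monte Carlo experiment. The sketch below is illustrative only: the function names, the noise variance and the deliberately large false-alarm target of 0.01 (chosen so that a modest number of trials suffices) are assumptions and not values from the report.

```python
import math
import random


def square_law_threshold(sigma2, p_fa):
    """Single-sample square-law threshold: gamma = -2 * sigma^2 * ln(P_FA)."""
    return -2.0 * sigma2 * math.log(p_fa)


def empirical_pfa(sigma2, gamma, trials, seed=0):
    """Fraction of noise-only I/Q samples whose squared envelope exceeds gamma."""
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    exceed = 0
    for _ in range(trials):
        x_i = rng.gauss(0.0, sigma)   # in-phase noise sample
        x_q = rng.gauss(0.0, sigma)   # quadrature noise sample
        if x_i * x_i + x_q * x_q > gamma:
            exceed += 1
    return exceed / trials


sigma2 = 2.0          # assumed noise variance per I/Q channel
p_fa_target = 1e-2    # illustrative target, far larger than an operational value
gamma = square_law_threshold(sigma2, p_fa_target)
p_fa_hat = empirical_pfa(sigma2, gamma, trials=200000, seed=3)
```

The empirical exceedance rate should sit close to the target, since the squared envelope of noise-only samples is exponentially distributed with mean $2\sigma^2$.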


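II.2.a.6  :  Probability of Detection

If the square-law output $z$ is greater than $\gamma$ from (22), then the decision is taken that a source target is
present. Hence the probability of detection can be calculated from:

$$P_D = \int_{\gamma}^{\infty} f_Z(z|s)\, dz = 1 - \int_0^{\gamma} f_Z(z|s)\, dz. \tag{23}$$

By letting $v = \sqrt{z}/(\sqrt{2}\,\sigma)$ and with appropriate substitutions,

$$P_D = 1 - 2\, e^{-A^2/(2\sigma^2)} \int_0^{\sqrt{\gamma}/(\sqrt{2}\sigma)} v\, e^{-v^2}\, I_0\!\left( 2\, \frac{A}{\sqrt{2}\,\sigma}\, v \right) dv. \tag{24}$$

But this integral possesses the form of an Incomplete Toronto Function [13, p. 348], which is defined as follows:

$$T_B(m, n, r) \triangleq 2\, r^{\,n-m+1}\, e^{-r^2} \int_0^{B} t^{\,m-n}\, e^{-t^2}\, I_n(2rt)\, dt. \tag{25}$$

Hence $P_D$ can be written in a more compact form as:

$$P_D = 1 - T_{\sqrt{\gamma}/(\sqrt{2}\sigma)}\!\left( 1,\; 0,\; \frac{A}{\sqrt{2}\,\sigma} \right). \tag{26}$$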
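The single-sample detection probability can also be evaluated numerically without special-function tables. The following sketch integrates the expression in (24), $P_D = 1 - 2e^{-r^2}\int_0^B v\,e^{-v^2} I_0(2rv)\,dv$ with $r = A/(\sqrt{2}\sigma)$ and $B = \sqrt{\gamma}/(\sqrt{2}\sigma)$, using Simpson's rule and a power series for $I_0$. The function names and numerical settings are illustrative assumptions, and the approach is only suitable for moderate signal-to-noise ratios.

```python
import math


def bessel_i0(x):
    """Modified Bessel function I_0(x) from its power series sum_k (x/2)^(2k) / (k!)^2."""
    q = (x / 2.0) ** 2
    term, total, k = 1.0, 1.0, 0
    while term > 1e-16 * total:
        k += 1
        term *= q / (k * k)
        total += term
    return total


def prob_detection(amplitude, sigma2, gamma, steps=2000):
    """P_D = 1 - 2 * integral_0^B v exp(-v^2 - r^2) I_0(2 r v) dv (Simpson's rule, even steps)."""
    r = amplitude / math.sqrt(2.0 * sigma2)
    upper = math.sqrt(gamma / (2.0 * sigma2))

    def integrand(v):
        return v * math.exp(-(v * v) - r * r) * bessel_i0(2.0 * r * v)

    h = upper / steps
    acc = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        acc += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return 1.0 - 2.0 * acc * h / 3.0


sigma2 = 1.0
p_fa = 1e-3
gamma = -2.0 * sigma2 * math.log(p_fa)           # threshold from (22)
pd_no_signal = prob_detection(0.0, sigma2, gamma)  # should reduce to P_FA
pd_weak = prob_detection(1.0, sigma2, gamma)
pd_strong = prob_detection(6.0, sigma2, gamma)
```

With zero amplitude the expression collapses to $e^{-\gamma/(2\sigma^2)}$, i.e. the false-alarm probability, which gives a convenient sanity check.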


II.2.b  :  Multiple Samples Case

The detector performance can be expected to improve if the decisions can be based on multiple observations
within a pulse. The question would then be how to combine the multiple samples in order to come up with an
inference. For the Envelope Detection case, Tsui and Sharpin have recently derived an M-out-of-N scheme where
the presence of a target is decided if at least M samples out of a total of N exceed the detection threshold [12].
In this work we take a different approach where decisions are taken based on the sum of N squared samples.
This approach is more akin to traditional CW detection schemes where integration over N pulses is performed
for making a decision [13].

Let $Y$ be the random variable formed with the sum of $N$ independent squared samples, i.e.,

$$Y \triangleq \sum_{k=1}^{N} Z(k), \tag{27}$$

where the PDF and CF of $Z(k)$ were derived in II.2.a.

II.2.b.1  :  The PDF and Characteristic Function of Y with No Signal

When no signal is present, the PDF of $Y$, which is formed as the sum of $N$ independent samples, is given by
the following convolution:

$$f_Y(y|\bar{s}) \triangleq f_Z(z_1|\bar{s}) \star f_Z(z_2|\bar{s}) \star \cdots \star f_Z(z_N|\bar{s}) \tag{28}$$

where each of the $Z(k)$'s has an identical distribution. Direct convolution of $N$ PDFs appears complicated, but it
is well known that convolution in the PDF domain implies multiplication in the CF domain. Consequently, the CF
of $Y$ is given by

$$C_Y^{\bar{s}}(\omega) = \prod_{k=1}^{N} C_Z^{\bar{s}}(\omega) = \frac{1}{(1 + j2\omega\sigma^2)^N}. \tag{29}$$

Using the inverse Fourier Transform pair 431 [Campbell and Foster, p. 44], the PDF of $Y$ is

$$f_Y(y|\bar{s}) = \frac{y^{N-1}\, e^{-y/(2\sigma^2)}}{(2\sigma^2)^N\, (N-1)!}\, u(y). \tag{30}$$


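II.2.b.2  :  The PDF and Characteristic Function of Y in Presence of Signal

Using arguments similar to those in the previous subsection, when signal is present, the CF of $Y$ is given by

$$C_Y^{s}(\omega) = \prod_{k=1}^{N} C_Z^{s}(\omega) = \left[ C_Z^{s}(\omega) \right]^N = \frac{1}{(1 + j2\omega\sigma^2)^N} \exp\!\left( \frac{-j\omega N A^2}{1 + j2\omega\sigma^2} \right). \tag{31}$$

Once again, using the inverse Fourier Transform pair 650.0 [Campbell and Foster, p. 77], the PDF of $Y$ is

$$f_Y(y|s) = \frac{1}{2\sigma^2} \left( \frac{y}{N A^2} \right)^{(N-1)/2} \exp\!\left( -\frac{N A^2 + y}{2\sigma^2} \right) I_{N-1}\!\left( \frac{A\sqrt{N y}}{\sigma^2} \right) u(y). \tag{32}$$

II.2.b.3  :  The Neyman-Pearson Criterion with Multiple Samples :

For this N-dimensional case, the decision that the signal is present is taken if the likelihood ratio [17] satisfies

$$\ell = \frac{f_Y(y|s)}{f_Y(y|\bar{s})} > k(P_{FA}). \tag{33}$$

For a given predetermined $P_{FA}$, the threshold can be set by integrating $f_Y(y|\bar{s})$ over the region where the signal
is declared present [11, 17].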

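II.2.b.4  :  Probability of False Alarm

For $\gamma$ denoting the threshold, the false-alarm probability is

$$P_{FA} = \int_{\gamma}^{\infty} f_Y(y|\bar{s})\, dy = 1 - \int_0^{\gamma/(2\sigma^2)} \frac{e^{-v}\, v^{N-1}}{(N-1)!}\, dv \tag{34a}$$

$$\phantom{P_{FA}} = 1 - I\!\left( \frac{\gamma}{2\sigma^2 \sqrt{N}},\; N-1 \right) \tag{34b}$$

where $I(\cdot)$ denotes the Incomplete Gamma Function, which is defined as

$$I(u, t) \triangleq \int_0^{u\sqrt{t+1}} \frac{e^{-v}\, v^{t}}{t!}\, dv. \tag{35}$$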

II.2.b.5  :  Detection  Threshold 


For a given $P_{FA}$, the threshold $\gamma$ can be determined numerically with a computer or using available
plots/tables [13].
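One way to carry out the numerical determination is a simple bisection. The sketch below relies on the fact that, for integer N, the noise-only tail probability of the sum of squares has the closed form $P_{FA} = e^{-x}\sum_{k=0}^{N-1} x^k/k!$ with $x = \gamma/(2\sigma^2)$, which follows from integrating the PDF of Y by parts. The function names and tolerances are illustrative assumptions.

```python
import math


def pfa_sum_of_squares(gamma, sigma2, n):
    """Noise-only tail probability: exp(-x) * sum_{k=0}^{n-1} x^k / k!, x = gamma / (2 sigma^2)."""
    x = gamma / (2.0 * sigma2)
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= x / k
        total += term
    return math.exp(-x) * total


def threshold_sum_of_squares(p_fa, sigma2, n, tol=1e-12):
    """Solve pfa_sum_of_squares(gamma) = p_fa for gamma by bisection (P_FA decreases with gamma)."""
    lo, hi = 0.0, 2.0 * sigma2
    while pfa_sum_of_squares(hi, sigma2, n) > p_fa:   # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if pfa_sum_of_squares(mid, sigma2, n) > p_fa:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


gamma_1 = threshold_sum_of_squares(1e-3, 1.0, 1)   # N = 1 should match the single-sample rule
gamma_4 = threshold_sum_of_squares(1e-3, 1.0, 4)
```

For N = 1 the result coincides with the single-sample threshold $-2\sigma^2 \ln P_{FA}$, and the threshold grows with N because more noise power is accumulated in the sum.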




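II.2.b.6  :  Probability of Detection

If the sum-of-squares $y$ is greater than the threshold $\gamma$ determined from (35), then the decision is taken that
a source target is present. Hence the probability of detection can be calculated from:

$$P_D = \int_{\gamma}^{\infty} f_Y(y|s)\, dy = 1 - \int_0^{\gamma} f_Y(y|s)\, dy$$

$$\phantom{P_D} = 1 - \int_0^{\gamma} \frac{1}{2\sigma^2} \left( \frac{y}{N A^2} \right)^{(N-1)/2} \exp\!\left( -\frac{N A^2 + y}{2\sigma^2} \right) I_{N-1}\!\left( \frac{A\sqrt{N y}}{\sigma^2} \right) dy.$$

By letting $v = \sqrt{y}/(\sqrt{2}\,\sigma)$ and with appropriate substitutions,

$$P_D = 1 - 2 \left( \frac{2\sigma^2}{N A^2} \right)^{(N-1)/2} e^{-N A^2/(2\sigma^2)} \int_0^{\sqrt{\gamma}/(\sqrt{2}\sigma)} v^{N}\, e^{-v^2}\, I_{N-1}\!\left( 2\, \frac{A\sqrt{N}}{\sqrt{2}\,\sigma}\, v \right) dv. \tag{36}$$

This integral also possesses the form of an Incomplete Toronto Function defined in (25). Hence $P_D$ can be written
in a more compact form as:

$$P_D = 1 - T_{\sqrt{\gamma}/(\sqrt{2}\sigma)}\!\left( 2N-1,\; N-1,\; \frac{A\sqrt{N}}{\sqrt{2}\,\sigma} \right). \tag{37}$$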

II.3  :  Envelope Detector

The PDF of the envelope (R) for a single sample was found in (8). Hence, for a given $P_{FA}$, the detection
threshold is

$$\gamma = \sqrt{-2\sigma^2 \ln P_{FA}}. \tag{39}$$

The threshold with $N$ observation samples can be shown to be [12]

$$\gamma = T\sigma\sqrt{N\left(2 - \tfrac{\pi}{2}\right)} + N\sigma\sqrt{\tfrac{\pi}{2}}, \tag{40a}$$

where $T$ is found approximately from

$$P_{FA} = 0.5\,\bigl(1 - \Phi(T)\bigr) \tag{40b}$$

and $\Phi(\cdot)$ denotes the error function. More details for the Envelope Detector case can be found in [12].

It may be noted that unlike the square-law and envelope detection threshold calculations for conventional
radars [13], the discretized schemes presented here do not use matched-filtered output but use the sampled data
directly.

References 

[1]  J. B. Y. Tsui, Microwave Receivers with Related Components, National Technical Information Center, 1982,
Peninsula, Los Altos, CA, 1985.

[2]  J. B. Y. Tsui, Microwave Receivers with Electronic Warfare Applications, John Wiley and Sons, New York,
1986.

[3]  D.  Curtis  Schleher,  Introduction  to  Electronic  Warfare,  Artech  House,  MA,  1986. 

[4]  J.  B.  Y.  Tsui,  Digital  Microwave  Receivers  :  Theory  and  Applications,  Artech  House,  MA,  1989. 




[5]  J.  Y.  Cheung,  “A  Direct  Adaptive  Frequency  Estimation  Technique,”  30th  Midwest  Symposium  on  Circuits 
and  Systems,  New  York,  Aug.,  1987. 

[6]  S. M. Kay, Modern Spectral Estimation: Theory and Applications, Prentice Hall, Englewood Cliffs, NJ, 1988.

[7]  R. Kumaresan, L. L. Scharf and A. K. Shaw, “An Algorithm for Pole-Zero Modeling and Spectral Estimation,”
IEEE Transactions on Acoustics, Speech and Signal Processing, vol. ASSP-34, pp. 637-640, June, 1988.

[8]  Y. Bresler and A. Macovski, “Exact Maximum Likelihood Parameter Estimation of Superimposed Exponential
Signals in Noise,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. ASSP-34, no.
10, pp. 1081-1089, Oct., 1988.

[9]  A. K. Shaw, “A Novel Cyclic Algorithm for Maximum-Likelihood Frequency Estimation,” IEEE International
Conference on Systems Engineering, Dayton, OH, Aug., 1991.

[10]  D. W. Tufts and R. Kumaresan, “Frequency Estimation of Multiple Sinusoids : Making Linear Prediction
Perform Like Maximum Likelihood,” Proceedings of the IEEE, vol. 70, pp. 975-989, Sept., 1982.

[11]  L. L. Scharf, Statistical Signal Processing - Detection, Estimation and Time Series Analysis, Addison-Wesley,
Reading, MA, 1990.

[12]  J.  B.  Y.  Tsui  and  D.  Sharpin,  Unpublished  Report  on  Time-Domain  Detection  for  Digital  Receivers,  June, 
1992. 

[13]  J.  V.  DiFranco  and  W.  L.  Rubin,  Radar  Detection,  Artech  House,  Inc.,  Dedham,  MA. 

[14]  J.  I.  Marcum,  “A  Statistical  Theory  of  Target  Detection  by  Pulsed  Radar,”  Trans.  IRE  Prof  Group  on 
Information  Theory,  IT-6;  vol.  2,  pp.  59-267,  April,  1960. 

[15]  G.  A.  Campbell  and  R.  M.  Foster,  Fourier  Integrals  for  Practical  Applications,  Van  Nostrand,  Princeton, 
NJ,  1948. 

[16]  W.  C.  Knight,  R.  G.  Pridham  and  S.  M.  Kay,  “Digital  Signal  Processing  of  Sonar,”  Proceedings  of  the  IEEE, 
vol.  69,  no.  11,  Nov.  1981. 

[17]  M.  Schwartz  and  L.  Shaw,  Signal  Processing  :  Discrete  Spectral  Analysis,  Detection,  and  Estimation, 
McGraw-Hill,  New  York,  1975. 




Section  -  2.6  :  PIPELINED-ADAPTIVE  TRACKING  OF  MULTIPLE  SINUSOIDAL  FREQUENCIES 

Summary 

New Pipelined-Adaptive algorithms are proposed for tracking multiple Frequencies or Angles-of-Arrival
(AOA) of moving targets. Pipelining of adaptive filters poses a critical challenge because of the timing mismatch
arising from the feedback signals. In this paper, some relaxation techniques [9] are utilized to pipeline adaptive
algorithms for high-speed tracking of frequencies/AOAs. Two adaptive tracking algorithms have been mapped
into pipelined forms, namely Least-Mean Squares (LMS) [3] and Recursive Least-Squares (RLS). Preliminary
simulation studies with multiple sources indicate encouraging results.

I.  Introduction :  Pipelined data-adaptive algorithms are presented for passive high-speed tracking of multiple
targets. In a non-stationary environment, or when target locations change with time, block-mode processing of
observation data is inappropriate and adaptive algorithms are preferable. Various adaptive algorithms
addressing this problem exist [3, 5], but the throughput rate of these algorithms is limited by the usually long critical
paths of the adaptive filters. Critical paths can be reduced by pipelining, which is usually accomplished by
introducing appropriate latches at intermediate stages to divide the critical path into multiple disjoint sections.
Pipelining allows the higher sampling rate and throughput essential in many radar applications such as digital
microwave receivers [11]. However, pipelining of adaptive filters poses an additional challenge due to the timing mismatch
produced by feedback signals [9]. Recently, some relaxation techniques have been found to be effective in pipelining
certain adaptive algorithms for coding and communication applications [9]. In this work, we study the effectiveness
of relaxations for pipelining adaptive tracking algorithms.

It may be noted that various adaptive algorithms for tracking multiple targets do exist, including LMS [3],
gradient adaptive lattice (GAL) [3], least squares lattice (LSL) [3], Recursive Least Squares (RLS) and QR-based
adaptive tracking algorithms [5]. However, to the best of our knowledge, none of these adaptive frequency tracking
algorithms has been implemented or studied in pipelined form. Here we present the results of relaxation-based
pipelining on LMS and RLS based tracking algorithms. Research on pipelining of the other tracking algorithms is
being conducted and will be reported later. It may be emphasized here that pipelining is beneficial not only
for speeding up adaptive tracking; recent studies indicate that pipelining can also be effective for reduction of
both power consumption [1] and chip area [8] using appropriate folding techniques.

II.  Look-Ahead Pipelining (LAP) :  Consider the first-order recursive equation given by

$$y(n+1) = a\,y(n) + x(n). \tag{1}$$

The corresponding transfer function is given by

$$H(z) = \frac{z^{-1}}{1 - a z^{-1}}. \tag{1b}$$

By applying an M-step look-ahead using back-substitution,

$$y(n) = a^{M} y(n-M) + \sum_{i=0}^{M-1} a^{i}\, x(n-i-1). \tag{2}$$

It can be easily shown that the transfer functions corresponding to (1) and (2) are identical. Note that $y(n)$
no longer depends on the previous output sample $y(n-1)$ but on an output that is M samples back in time, i.e.,
$y(n-M)$. Hence, the immediate dependence problem has been removed [4, 6], i.e., the signals can be sampled
more often or there is more time for computation. This implies that the throughput of the logic unit is increased
by a factor of M, leading to high-speed implementation. This technique is referred to as time-domain [6, 10]
or clustered look-ahead pipelining [7]. Note also that the speed-up in throughput is achieved at the cost of higher
hardware complexity due to the overhead term in (2).
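The equivalence of the serial recursion and its M-step look-ahead form can be checked numerically. The sketch below is illustrative only; the function names, the coefficient value and the test input are assumptions, and the first M outputs of the look-ahead form are taken from the serial recursion as initial conditions.

```python
import math


def serial_recursion(a, x, y0=0.0):
    """y(n+1) = a*y(n) + x(n), starting from y(0) = y0; returns y(0), ..., y(len(x))."""
    y = [y0]
    for x_n in x:
        y.append(a * y[-1] + x_n)
    return y


def look_ahead_recursion(a, x, m, y_init):
    """y(n) = a^M * y(n-M) + sum_{i=0}^{M-1} a^i * x(n-i-1), given y(0), ..., y(M-1)."""
    y = list(y_init)
    a_pow = [a ** i for i in range(m + 1)]
    for n in range(m, len(x) + 1):
        overhead = sum(a_pow[i] * x[n - i - 1] for i in range(m))  # the overhead term
        y.append(a_pow[m] * y[n - m] + overhead)
    return y


a, m = 0.9, 4
x = [math.sin(0.3 * k) for k in range(50)]          # arbitrary test input
y_serial = serial_recursion(a, x)
y_look_ahead = look_ahead_recursion(a, x, m, y_serial[:m])
max_diff = max(abs(p - q) for p, q in zip(y_serial, y_look_ahead))
```

In the look-ahead form each output depends on an output M samples old, so the feedback loop can hold M pipeline latches; the price is the M-term overhead sum computed in the feed-forward path.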

III.  Relaxation  Techniques  :  The  above  LAP  scheme  maintains  exact  equivalence  in  the  transfer  function. 
However,  the  second  term  in  (2)  is  an  overhead  term  and  from  the  standpoint  of  hardware,  exact  computation  of 
this  overhead  term  may  be  impractical,  particularly  for  real-time  adaptive  implementations.  It  has  been  shown  in 
[9]  for  a  variety  of  coding  and  communication  related  applications,  that  these  overhead  terms  can  be  approximated 
under  certain  circumstances. 

III.1. Sum Relaxation : In (2), if $a \approx 1$ and if $x(n)$ remains approximately constant over M clock cycles, then
we can replace the overhead term in $y(n+M)$ to obtain [9],

$$ y(n+M) = a^{M} y(n) + M x(n). \tag{3} $$


III.2. Product Relaxation : When a is time-varying (represented more appropriately by $a(n)$), but its magnitude
is close to unity, then $a(n)$ can be written as $(1 - a'(n))$ where $a'(n)$ is close to zero. The equation for $y(n+M)$
can be approximated as [9],

$$ y(n+M) = \bigl(1 - M a'(n)\bigr)\, y(n) + \sum_{i=0}^{M-1} a^{i}(n)\, x(n+M-1-i). \tag{4} $$
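To get a feel for the size of the error introduced by the two relaxations, the short Python sketch below compares the exact overhead term with its sum-relaxed value, and the exact product $a^M$ with its product-relaxed value. The numbers (a = 0.98, M = 3, a gently ramping input) and the helper names are assumptions chosen only for illustration.

```python
def exact_overhead(a, x, n, M):
    """Exact look-ahead overhead term: sum_{i=0}^{M-1} a^i * x(n+M-1-i)."""
    return sum(a ** i * x[n + M - 1 - i] for i in range(M))

def sum_relaxed_overhead(x, n, M):
    """Sum relaxation: the overhead term is replaced by M * x(n)."""
    return M * x[n]

a, M, n = 0.98, 3, 10
# Slowly varying input (the assumption behind the sum relaxation).
x = [1.0 + 0.001 * k for k in range(32)]

exact = exact_overhead(a, x, n, M)
relaxed = sum_relaxed_overhead(x, n, M)
rel_err = abs(exact - relaxed) / abs(exact)

# Product relaxation: a^M = (1 - a')^M is replaced by 1 - M*a'.
a_prime = 1.0 - a
product_exact = a ** M
product_relaxed = 1.0 - M * a_prime

print(rel_err, abs(product_exact - product_relaxed))
```

Both errors shrink as a approaches unity and as the input varies more slowly, which matches the conditions under which the relaxations are stated.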


IV. Adaptive Frequency Tracking using Pipelined LMS Adaptive Filters : Consider the unpipelined
LMS algorithm, which is referred to as the 'serial' LMS (SLMS) algorithm:


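$$ E(n) = (1 - \alpha_{\mathrm{LMS}})\, E(n-1) + \alpha_{\mathrm{LMS}}\, u^{2}(n) \tag{5} $$

$$ \mu(n) = \frac{\alpha_{\mathrm{LMS}}}{p\, E(n)} \tag{6} $$

$$ \mathbf{W}(n) = \mathbf{W}(n-1) + \mu(n)\, e(n)\, \mathbf{U}(n) \tag{7} $$

$$ e(n) = d(n) - \mathbf{W}^{T}(n-1)\, \mathbf{U}(n) \tag{8} $$

where $0 < \alpha_{\mathrm{LMS}} < 1$ is used, p is the order of the transversal filter and $E(n)$ is the power of the signal
samples within the tracking window [3]. By applying M-step look-ahead to equations (5) and (7), we have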


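$$ E(n) = (1 - \alpha_{\mathrm{LMS}})^{M} E(n-M) + \alpha_{\mathrm{LMS}} \sum_{i=0}^{M-1} (1 - \alpha_{\mathrm{LMS}})^{i}\, u^{2}(n-i) \tag{9} $$

$$ \mathbf{W}(n) = \mathbf{W}(n-M) + \mu(n) \sum_{i=0}^{M-1} e(n-i)\, \mathbf{U}(n-i). \tag{10} $$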


This introduces M latches in the recursive loops which may be redistributed to pipeline the feedback multiply-add
operation by M levels. By substituting equation (10) into equation (8) we obtain the error equation

$$ e(n) = d(n) - \Bigl[ \mu(n) \sum_{i=0}^{M-1} e(n-i-1)\, \mathbf{U}(n-i-1) \Bigr]^{T} \mathbf{U}(n) - \mathbf{W}^{T}(n-M-1)\, \mathbf{U}(n). \tag{11} $$


Clearly, the number of overhead terms after applying look-ahead is rather high. By applying the sum relaxation
to equation (11) and replacing $\mathbf{W}(n-M-1)$ by $\mathbf{W}(n-M)$ we can approximate equation (11) as

$$ e(n) = d(n) - \mathbf{W}^{T}(n-M)\, \mathbf{U}(n). \tag{12} $$

The sum relaxation is applied assuming that $\mu(n)$ is relatively small and hence the second term in equation (11)
does not have a dominating effect on the error calculation. We also approximate equation (9) to

$$ E(n) = (1 - \alpha_{\mathrm{LMS}})^{M} E(n-M) + \alpha_{\mathrm{LMS}} \sum_{i=0}^{M-1} u^{2}(n-i). \tag{13} $$


Equations  (6),  (10),  (12)  and  (13)  constitute  the  relaxed  look-ahead  pipelined  LMS  algorithm  (PLMS).  It  may 
be  noted  here  that  the  PLMS  version  given  in  [9]  does  not  make  use  of  or  pipeline  this  adaptive  error  calculation 
given  in  (13)  which  has  been  shown  to  be  convenient  in  adaptive  tracking  [3]. 

Note that the hardware complexity after relaxed look-ahead pipelining has increased by $(N+1)(M-1)$ adders
because of the overhead terms in (9) and (10). The architectures of the SLMS and PLMS filters are shown in
Fig. 1a and Fig. 1b, respectively. By comparing the critical paths of the two architectures we see that, by proper
distribution of the extra delays introduced by pipelining, the pipelined architecture can be made to operate
approximately M times faster.
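The relaxed recursions can be exercised in software. The Python sketch below is a minimal, hedged illustration rather than the implementation used for the reported results: it follows the structure of Eqs. (6), (10), (12) and (13), assumes a one-step predictor setup (desired signal d(n) = u(n), regressor U(n) = [u(n-1), ..., u(n-p)]), and uses an assumed initial power estimate, test frequency and noise level. Setting M = 1 gives the serial LMS.

```python
import math
import random

def relaxed_lms(u, p, alpha, M):
    """One-step LMS predictor with M-step relaxed look-ahead (M = 1: serial).

    Assumed setup: d(n) = u(n) and U(n) = [u(n-1), ..., u(n-p)].
    Returns the list of prediction errors e(n).
    """
    N = len(u)
    E0 = 1.0                                   # assumed initial power estimate
    W = [[0.0] * p for _ in range(N)]          # W[n] stores the vector W(n)
    E = [E0] * N
    e = [0.0] * N

    def regressor(n):
        return [u[n - k] if n - k >= 0 else 0.0 for k in range(1, p + 1)]

    for n in range(N):
        W_old = W[n - M] if n >= M else [0.0] * p
        E_old = E[n - M] if n >= M else E0
        window = range(max(0, n - M + 1), n + 1)   # last M sample indices
        # Eq. (13): relaxed power recursion over M steps.
        E[n] = (1 - alpha) ** M * E_old + alpha * sum(u[k] ** 2 for k in window)
        mu = alpha / (p * E[n])                    # Eq. (6)
        # Eq. (12): the error uses the weights from M samples ago.
        e[n] = u[n] - sum(w * v for w, v in zip(W_old, regressor(n)))
        # Eq. (10): W(n) = W(n-M) + mu(n) * sum_i e(n-i) U(n-i).
        W[n] = list(W_old)
        for k in window:
            Uk = regressor(k)
            for j in range(p):
                W[n][j] += mu * e[k] * Uk[j]
    return e

def tail_mse(e, last=500):
    """Mean squared error over the last `last` samples."""
    return sum(v * v for v in e[-last:]) / last

random.seed(0)
N, p = 2000, 4
u = [math.sin(2 * math.pi * 0.125 * n) + random.gauss(0.0, 0.07)
     for n in range(N)]

mse_serial = tail_mse(relaxed_lms(u, p, alpha=0.04, M=1))
mse_pipelined = tail_mse(relaxed_lms(u, p, alpha=0.04, M=3))
print(mse_serial, mse_pipelined)
```

In this sketch the weight update at time n only needs W(n-M), so in hardware the M delays in the loop can be moved into the multiply-add path; the software version does not show the speed-up itself, only that the relaxed recursion still adapts.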


V.  Adaptive  Frequency  Tracking  using  Pipelined  Recursive  Least-Squares  Algorithm  :  The  ‘serial’ 
recursive  least-squares  (SRLS)  algorithm  is  described  by  [2,9]  : 


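$$ \mathbf{k}(n) = \frac{\lambda^{-1} \mathbf{P}(n-1)\, \mathbf{u}(n)}{1 + \lambda^{-1} \mathbf{u}^{T}(n)\, \mathbf{P}(n-1)\, \mathbf{u}(n)}; $$

$$ \alpha(n) = d(n) - \mathbf{W}^{T}(n-1)\, \mathbf{u}(n); $$

$$ \mathbf{W}(n) = \mathbf{W}(n-1) + \mathbf{k}(n)\, \alpha(n); $$

$$ \mathbf{P}(n) = \lambda^{-1} \bigl[ \mathbf{P}(n-1) - \mathbf{k}(n)\, \mathbf{u}^{T}(n)\, \mathbf{P}(n-1) \bigr]; $$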


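where $\mathbf{u}^{T}(n) = [u(n), u(n-1), \ldots, u(n-N+1)]$ is the input vector, $\mathbf{W}^{T}(n) = [w_1(n), \ldots, w_N(n)]$ is the vector
of weights, $d(n)$ is the desired signal, $\mathbf{k}(n)$ is the Kalman filter gain, $\alpha(n)$ is the error and $\mathbf{P}(n) = \mathbf{\Phi}^{-1}(n)$, $\mathbf{\Phi}(n)$
being the deterministic autocorrelation matrix of the input signal, $\mathbf{\Phi}(n) \triangleq \sum_{i=0}^{n} \lambda^{n-i}\, \mathbf{u}(i)\, \mathbf{u}^{T}(i)$. This RLS
algorithm solves for $\mathbf{W}(n)$ in the following normal equations,

$$ \mathbf{\Phi}(n)\, \mathbf{W}(n) = \boldsymbol{\theta}(n) $$

where the cross-correlation vector $\boldsymbol{\theta}(n) \triangleq \sum_{i=0}^{n} \lambda^{n-i}\, d(i)\, \mathbf{u}(i)$. $\boldsymbol{\theta}(n)$ and $\mathbf{\Phi}(n)$ can be computed recursively as

$$ \boldsymbol{\theta}(n) = \lambda\, \boldsymbol{\theta}(n-1) + d(n)\, \mathbf{u}(n); $$

$$ \mathbf{\Phi}(n) = \lambda\, \mathbf{\Phi}(n-1) + \mathbf{u}(n)\, \mathbf{u}^{T}(n). $$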

Relaxed look-ahead pipelining is applied to the above two recursive equations to obtain

$$ \boldsymbol{\theta}(n) = \lambda\, \boldsymbol{\theta}(n-M) + L_A\, d(n)\, \mathbf{u}(n); $$

$$ \mathbf{\Phi}(n) = \lambda\, \mathbf{\Phi}(n-M) + L_A\, \mathbf{u}(n)\, \mathbf{u}^{T}(n); $$

where $L_A$ is the look-ahead factor. The sum and product relaxations were used to obtain the pipelined equations.
Using these equations to solve for $\mathbf{W}(n)$, the pipelined RLS (PRLS) equations can be re-derived [9] and are given
as,

$$ \mathbf{k}(n) = \frac{\lambda^{-1} \mathbf{P}(n-M)\, \mathbf{u}(n)}{1 + \lambda^{-1} \mathbf{u}^{T}(n)\, \mathbf{P}(n-M)\, \mathbf{u}(n)} $$

$$ \alpha(n) = d(n) - \mathbf{W}^{T}(n-M)\, \mathbf{u}(n) $$

$$ \mathbf{W}(n) = \mathbf{W}(n-M) + \mathbf{k}(n)\, \alpha(n) $$

$$ \mathbf{P}(n) = \lambda^{-1} \bigl[ \mathbf{P}(n-M) - \mathbf{k}(n)\, \mathbf{u}^{T}(n)\, \mathbf{P}(n-M) \bigr] $$

The architectures for SRLS and PRLS are given in Fig. 2a and Fig. 2b, respectively. We can see that the
hardware complexity is almost the same as that of the serial algorithm except for an additional 2M latches. One
set of M latches corresponds to that required to pipeline the W-loop and the other to pipeline the P-loop. The
M latches can be redistributed within the architecture so as to maximize throughput. Furthermore, by employing
the folding transformation [8], the hardware of the pipelined algorithm can be further reduced.
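The PRLS recursions differ from the serial ones only in that W and P are read from M samples back. A minimal NumPy sketch is given below under stated assumptions: it identifies a short FIR system instead of tracking frequencies, and the initial values (W = 0, P = delta * I), the test system and the noise level are illustrative choices that do not come from the report.

```python
import numpy as np

def prls(u, d, order, lam, M, delta=100.0):
    """Pipelined RLS sketch; M = 1 reduces to the serial RLS (SRLS)."""
    N = len(u)
    W = np.zeros((N, order))                 # W[n] stores W(n)
    P = np.zeros((N, order, order))          # P[n] stores P(n)
    W0, P0 = np.zeros(order), delta * np.eye(order)   # assumed initial state
    for n in range(N):
        W_old = W[n - M] if n >= M else W0
        P_old = P[n - M] if n >= M else P0
        # Input vector [u(n), u(n-1), ..., u(n-order+1)], zero before n = 0.
        un = np.array([u[n - k] if n - k >= 0 else 0.0 for k in range(order)])
        Pu = P_old @ un / lam
        k = Pu / (1.0 + un @ Pu)             # gain vector k(n)
        err = d[n] - W_old @ un              # a priori error alpha(n)
        W[n] = W_old + k * err
        P[n] = (P_old - np.outer(k, un @ P_old)) / lam
    return W

rng = np.random.default_rng(0)
N = 600
w_true = np.array([0.5, -0.3, 0.2])          # hypothetical system to identify
u = rng.standard_normal(N)
d = np.convolve(u, w_true)[:N] + 0.01 * rng.standard_normal(N)

W_serial = prls(u, d, order=3, lam=0.95, M=1)
W_pipelined = prls(u, d, order=3, lam=0.95, M=3)
print(W_serial[-1], W_pipelined[-1])
```

With M > 1 the recursion above behaves like M interleaved RLS estimators, each updated once every M samples, which is why only the extra storage for the delayed W and P values is needed.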

VI.  Simulation  Results  :  Several  simulations  have  been  conducted  to  verify  the  performance  of  the  pipelined 
adaptive  algorithm  in  tracking  time  varying  frequencies  at  various  noise  levels. 

Simulation 1 : The data set consists of 2 real sinusoids of 0.4375 Hz and 0.1250 Hz which undergo a step change
to 0.3750 Hz and 0.0625 Hz, respectively. The signals are at signal-to-noise ratios (SNRs) of 20 dB and 15 dB,
respectively. Fig. 3a and 3b show the tracking characteristics of the SLMS (when M = 1) and PLMS (with M
= 3), respectively, keeping the new parameter at $\alpha_{\mathrm{LMS}} = 0.04$. Fig. 4a and 4b show the corresponding results using
SRLS (with M = 1) and PRLS (with M = 3), respectively, with $\lambda = 0.95$. Clearly, in both cases convergence for
the target with higher SNR is quicker than for the low-SNR target. Furthermore, there is little effect on convergence
time due to the relaxations used in pipelining, especially in the case of the PLMS. In the case of PRLS, there appears to
be some jitter when trying to keep up with the change for the lower-SNR target.

Simulation 2 : The effectiveness of PLMS and PRLS in tracking time-varying frequencies has also been
tested by letting the algorithm track two sinusoidal FM signals ($f_{c1} = f_{c2} = 0.1250$ Hz and
$\Delta f_1 = \Delta f_2 = 0.0625$ Hz), where $f_c$ and $\Delta f$ represent the center frequency and peak frequency deviation of an
FM signal, respectively. Both signals are at 20 dB. The simulations are shown in Fig. 5a and 5b for the SLMS
case (M = 1) and the PLMS case (M = 3), respectively, with $\alpha_{\mathrm{LMS}} = 0.15$. Fig. 6a and 6b show the simulation
results for the SRLS (M = 1) and PRLS (M = 3) cases, respectively, with $\lambda = 0.7$. Again the PLMS and the PRLS
show minimal convergence degradation.


Figure 4: (a) SRLS adaptive tracking (b) PRLS adaptive tracking.

Figure 6: Adaptive tracking using (a) SRLS (b) PRLS, with lambda = 0.7.


References 


[1]  A.  P.  Chand,  “Low  Power  CMOS  Digital  Design,”  IEEE  J.  of  Solid-State  Circuits,  vol.  27,  pp.  473-484, 
Apr.,  1992. 

[2] S. Haykin, “Adaptive Filter Theory”, Englewood Cliffs, NJ: Prentice-Hall, 2nd Ed., 1991.

[3] W. S. Hodgkiss, Jr. and J. A. Presley, Jr., “Adaptive Tracking of Multiple Sinusoids Whose Power Levels
are Widely Separated”, IEEE Trans. Circuits Syst., vol. CAS-28, no. 6, pp. 550-561, June 1981.

[4] P. M. Kogge and H. S. Stone, “A Parallel Algorithm for the Efficient Solution of a General Class of Recurrence
Equations”, IEEE Trans. Comput., vol. C-22, pp. 786-793, Aug. 1973.

[5] Z. S. Liu, “QR Methods of O(N) Complexity in Adaptive Parameter Estimation”, IEEE Trans. Signal
Processing, vol. 43, no. 3, March 1995.

[6] H. H. Loomis and B. Sinha, “High-Speed Recursive Digital Filter Realization”, Circuits, Systems and Signal
Processing, vol. 3, pp. 267-294, Sept. 1984.

[7] K. K. Parhi and D. G. Messerschmitt, “Pipeline Interleaving and Parallelism in Recursive Digital Filters -
Part I: Pipelining using Scattered Look-Ahead and Decomposition,” IEEE Trans. on Acoustics, Speech
and Signal Proc., vol. 37, pp. 1099-1117, July 1989.

[8] K. K. Parhi, C. Y. Wang and A. P. Brown, “Synthesis of Control Circuits in Folded Pipelined DSP Architectures”,
IEEE J. of Solid-State Circuits, vol. 27, no. 1, pp. 29-43, Jan. 1992.

[9] N. R. Shanbhag and K. K. Parhi, Pipelined Adaptive Digital Filters, Kluwer Academic Publishers, Boston,
MA, 1994.

[10] M. A. Soderstrand, K. Chopper and B. Sinha, “Comparison of three new techniques for pipelining IIR digital
filters,” Twenty-Third ASILOMAR Conference on Signals, Systems and Computers, Pacific Grove, CA, pp.
439-443, Nov. 1989.

[11]  J.  B.  Y.  Tsui,  Digital  Microwave  Receivers  :  Theory  and  Applications,  Artech  House,  MA,  1989. 

[12]  J.  F.  Yang  and  H.  J.  Lin,  “Adaptive  High-Resolution  Algorithms  for  Tracking  Nonstationary  Sources,”  IEEE 
Transactions  on  Signal  Processing,  vol.  42,  no.  3,  Jan.,  1994. 




CHAPTER  3 


SYSTEM IDENTIFICATION AND HARDWARE IMPLEMENTATION PROBLEMS
Introduction 

The rational System Identification theory is closely related to the receiver design problem. In particular,
Angle-of-Arrival (AOA) and frequency estimation are two of the most important integral parts of most radar
receivers, and these two problems can be addressed as special cases of rational system identification problems.
Furthermore, digital EW receivers would require many digital filters for various purposes such as anti-aliasing,
image suppression, IF filtering, etc. Synthesis of digital IIR filters from arbitrary frequency-domain specifications
is also one of the important problems addressed by the proposed system identification framework. Identification
of unknown discrete-time linear systems is a fundamental problem in signal processing. Among many available
parametric models, the pole-zero or rational transfer function model is one of the most effective and practical
representations. Optimal estimation of rational model parameters will be the focus of this part of the report. The
system identification and signal analysis problems considered here are fundamental in nature and the results are
expected to have impact and usefulness in a wide range of applications including EW receiver design.

Applications of System Identification abound in Communication systems, Automatic Control systems,
Aerospace and Mechanical Systems, Econometrics and many other fields. Digital filter design from frequency
and/or time-domain information has extensive applications in speech or image processing, communication, radar
or sonar signal processing, bio-medical signal processing, Digital Instrumentation and Control and in various
other fields. Depending on the application, the design specifications of an unknown system may be available
or prescribed in the time-domain (T-D) as (i) Impulse Response (IR) or (ii) Input-Output (IO) data, and in
the frequency-domain (F-D) as (iii) Frequency Response (FR) data. The standard synthesis or identification
problem  is  to  estimate  the  numerator  and  denominator  polynomial  coefficients  that  match  the  prescribed  specs 
in  the  least-squares  (LS)  sense.  It  is  well-known  that  these  LS  problems  are  highly  non-linear.  Some  existing 
approaches  minimize  ‘equation  errors’  instead  of  the  true  fitting  errors  and  others  modify  or  linearize  the  true 
model-fitting  criteria  for  iterative  estimation  of  the  numerator  and  denominators  simultaneously. 

The main goal in this part of the work is to exploit certain powerful theoretical results in Numerical Analysis
to theoretically decouple the multidimensional nonlinear criterion into two distinct problems: (1) a purely linear
problem for estimating the numerator and (2) a non-linear problem for estimating the denominator. The nonlinear
part is then reparameterized by invoking results on projection operators. In this form, the denominator criterion
possesses a weighted matrix structure which is convenient for iterative optimization. But more importantly,
once the optimal denominator is known, the optimal numerator is found with only a single step of linear LS
estimation. Removal of the numerator estimation from the iterative process reduces computational complexity
when compared with existing simultaneous estimators.

The  theoretical  results  as  well  as  the  algorithmic  framework  we  propose  here  encompass  a  comprehensive 
class  of  system  identification  problems  in  time  and  frequency  domains.  This  important  underlying  common  theme 
appears to have remained unrecognized and un-utilized. In fact, one of our goals is to establish the analogies and
equivalences  between  the  time-domain  and  frequency-domain  optimization  approaches  which  seem  to  have  evolved 
independently.  Our  hope  is  that  a  thorough  study  and  proper  understanding  of  these  equivalences  might  enable 
us  to  apply  and  exchange  useful  ideas  from  one  domain  to  the  other.  It  may  also  lead  to  combined  optimization 
in  the  frequency  and  time  domains  by  matching  the  desired  characteristics  in  both  domains  simultaneously. 

The proposed unified framework is expected to provide intuitive and useful theoretical insights into various
time-domain and frequency-domain identification and synthesis problems. For example, the 1-D SISO algorithms
can be extended to multi-dimensional (m-D) and multi-input/multi-output (MIMO) problems in a
straightforward manner.

Look-ahead pipelining has been found to be very effective for attaining high sampling rate and high computation
speed in low-cost VLSI implementation of recursive digital filters. The well-known Scattered Look-ahead
implementation of Recursive IIR filters achieves stability at the cost of increased multiplication and latch
complexities and considerable delay in output generation. The clustered look-ahead approach cannot always guarantee
stability [1]. We present a new scheme (referred to as distributed look-ahead) which is a compromise between
the two existing look-ahead approaches. The proposed scheme appears to avoid some of the potential drawbacks
in various pipelined implementations of recursive filters. Our work shows that, in order to attain stability, the
output samples need not be clustered or equally scattered. Indeed, in many filter design problems, stability can
be maintained by using unequally distributed past output samples. When compared with the scattered approach,
the proposed scheme uses fewer pole-zero cancelations and the introduced roots are not necessarily at
the same radii as the original filter poles. Hence, the proposed distributed look-ahead (DLA) scheme has reduced
multiplication and latch complexities and higher area-efficiency, and it produces outputs with reduced delay. The proposed
DLA scheme has been used for high-speed implementation of both 1-D and 2-D Recursive Digital Filters.

The  look-ahead  pipelined  recursive  filters  discussed  above  are  obtained  primarily  via  transformation  of  a 
given  un-pipelined  transfer  function.  For  these  approaches,  it  is  assumed  that  the  un-pipelined  transfer  function 
has  already  been  designed  as  an  intermediate  step.  In  this  project,  we  also  present  a  new  algorithm  (OM-LA) 
for  direct  and  optimal  estimation  of  the  coefficients  of  recursive  filters  in  look-ahead  pipelined  form.  OM-LA  is 
developed  by  appropriate  modification  of  a  recently  proposed  optimal  method  (OM)  for  designing  un-pipelined 
filters  (developed  previously  by  the  PI  as  part  of  a  project  supported  by  the  AFOSR).  It  is  demonstrated  that  the 
proposed  one-step  approximation  can  achieve  superior  match  with  reduced  pipelined  filter  order  because  it  does 
not  rely  on  pole-zero  cancelations  as  in  current  LA  pipelining  approaches.  It  is  also  shown  that  the  denominator 
polynomial  can  be  constrained  to  possess  any  of  the  possible  look-ahead  configurations.  Unlike  some  existing 
methods,  OM-LA  minimizes  the  true  time-domain  fitting  error-norm  between  the  prescribed  and  the  estimated 
impulse response and produces superior results.

Section - 3.1 : IDENTIFICATION OF 1-D RATIONAL SYSTEMS FROM INPUT-OUTPUT DATA


Summary 


A  theoretical  and  algorithmic  framework  is  proposed  for  optimal  identification  of  rational  transfer  function 
parameters of discrete-time linear systems from Input-Output (IO) data. The nonlinear criterion is theoretically
decoupled  into  a  purely  linear  problem  for  estimating  the  optimal  numerator  and  a  nonlinear  problem  for  the 
optimal  denominator.  The  proposed  decoupled  approach  has  reduced  computational  requirements  when  compared 
to  existing  algorithms  which  estimate  the  parameters  simultaneously. 


I. Introduction : Identification of unknown Linear Time-Invariant Discrete-Time systems is a critical problem
in signal processing and control theory [1-13, 15-22]. This work addresses the problem of optimal identification
of the parameters of rational transfer functions by Least-Squares (LS) fitting of observed input-output sequences.
Optimization of the LS criterion for this problem requires multi-dimensional nonlinear optimization [1, 2, 15-21].
Many existing algorithms either modify or linearize the true nonlinear error criterion to estimate the unknown
parameters simultaneously. This work will demonstrate that the optimal rational model identification problem
belongs to a special class of mixed-nonlinear optimization framework where the linear and nonlinear variables
separate [14]. The true nonlinear criterion will be theoretically decoupled into:

(i)  a  purely  linear  problem  for  obtaining  the  optimal  numerators  and 

(ii)  a  nonlinear  problem  of  reduced  dimensionality  for  determining  the  optimal  denominators. 

The  decoupled  criteria  retain  the  global  optima  of  the  original  criterion.  Only  the  criterion  for  the  denominator 
is  nonlinear  but  it  possesses  a  weighted-matrix  structure  which  is  utilized  for  minimizing  it  iteratively.  The 
optimal  numerator  is  estimated  in  one  step.  Hence,  unlike  some  existing  algorithms  which  estimate  both  sets  of 
parameters  iteratively  [2],  the  proposed  computational  algorithm  has  reduced  computational  requirements. 

II. Problem Formulation : The rational transfer function representation of a SISO plant is

$$ H(z) \triangleq \frac{a(0) + a(1) z^{-1} + \cdots + a(q) z^{-q}}{1 + b(1) z^{-1} + \cdots + b(p) z^{-p}} = \frac{A(z)}{B(z)}, \tag{1} $$

$$ \phantom{H(z)} = h(0) + h(1) z^{-1} + \cdots + h(N-1) z^{-(N-1)} + \cdots, \tag{2} $$

where $b(0) = 1$. Fig. 1 depicts what is commonly known as the output-error model of a plant, where $y_o(n)$ and
$y(n)$ denote the true and observed (possibly noisy) output signals, respectively, and $v(n)$ denotes the observation
or measurement noise. Let

$$ \mathbf{x} \triangleq [\, x(0) \;\; x(1) \;\; \cdots \;\; x(N-1) \,]^{T} \tag{3a} $$

and

$$ \mathbf{y} \triangleq [\, y(0) \;\; y(1) \;\; \cdots \;\; y(N-1) \,]^{T} \tag{3b} $$

denote the vectors containing the N input and observed samples, respectively. In vector form, the unknown model
parameters are defined as

$$ \mathbf{a} \triangleq [\, a(0) \;\; a(1) \;\; \cdots \;\; a(q) \,]^{T} \tag{4a} $$

and

$$ \mathbf{b} \triangleq [\, 1 \;\; b(1) \;\; \cdots \;\; b(p) \,]^{T}. \tag{4b} $$
The problem under consideration in this part of the project can be stated as follows:

Given the observed output data $\mathbf{y}$ and the input data $\mathbf{x}$, estimate the optimal model parameters $\mathbf{a}$ and $\mathbf{b}$ by
minimizing the following LS model-fitting criterion:

$$ \min_{\mathbf{a},\mathbf{b}} \sum_{i=0}^{N-1} \bigl[ y(i) - x(i) * h(i) \bigr]^{2}. \tag{5} $$


Regarding  methods  related  to  this  work,  Kalman  [1]  had  defined  an  equation  error  to  solve  this  problem  (KM), 
whereas  Steiglitz  and  McBride  (SMM)  iteratively  minimized  a  modified  error  criterion  [2]  to  estimate  both  sets 
of  parameters  simultaneously. 
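For comparison with the decoupled approach developed next, the equation-error idea can be sketched in a few lines of NumPy. This is one common reading of the equation-error formulation, not necessarily the exact form used in [1]: the relation B(z)Y(z) = A(z)X(z) is written as a linear regression in the unknown coefficients and solved by ordinary least squares. The helper names and the test system are assumptions for illustration.

```python
import numpy as np

def simulate(a, b, x):
    """Output of A(z)/B(z) driven by x, with zero initial conditions."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
        acc -= sum(b[k] * y[n - k] for k in range(1, len(b)) if n - k >= 0)
        y[n] = acc
    return y

def equation_error_fit(x, y, p, q):
    """LS fit of y(n) + sum_k b(k) y(n-k) = sum_k a(k) x(n-k)."""
    rows = []
    for n in range(len(y)):
        past_y = [-y[n - k] if n - k >= 0 else 0.0 for k in range(1, p + 1)]
        past_x = [x[n - k] if n - k >= 0 else 0.0 for k in range(q + 1)]
        rows.append(past_y + past_x)
    theta, *_ = np.linalg.lstsq(np.array(rows), np.asarray(y), rcond=None)
    b = np.concatenate(([1.0], theta[:p]))   # b(0) is fixed to 1
    a = theta[p:]
    return a, b

rng = np.random.default_rng(1)
a_true = np.array([1.0, 0.5])                # hypothetical numerator
b_true = np.array([1.0, -1.2, 0.72])         # hypothetical denominator
x = rng.standard_normal(40)
y = simulate(a_true, b_true, x)              # noise-free output
a_est, b_est = equation_error_fit(x, y, p=2, q=1)
print(a_est, b_est)
```

On noise-free data the equation error can be driven to zero, so the true coefficients are recovered; with noisy outputs the minimized quantity is no longer the output error in (5), which is the distinction drawn in the text.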

III. Proposed Method (OM-IO) : Let $H_b(z)$ be the inverse filter corresponding to $B(z)$, i.e.,

$$ B(z) H_b(z) = 1. \tag{6} $$

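This is a convolution operation and hence, in matrix notation,

$$ \mathbf{B}_I \mathbf{H}_I = \mathbf{I}_N, \tag{7} $$

where $\mathbf{I}_N$ denotes an $N \times N$ identity matrix; $\mathbf{B}_I$ and $\mathbf{H}_I$ are convolution matrices formed as

$$ \mathbf{B}_I(i,j) \triangleq b(i-j), \tag{8a} $$

and

$$ \mathbf{H}_I(i,j) \triangleq h_b(i-j), \quad \text{for } i,j = 1, \ldots, N. \tag{8b} $$

Note that both these matrices are lower-triangular. In partitioned form,

$$ \mathbf{B}_I = \begin{bmatrix} \mathbf{B}_u \\ \mathbf{B}^{T} \end{bmatrix}, \qquad \mathbf{H}_I = [\, \mathbf{H}_l \mid \mathbf{H}_r \,], \tag{9} $$

where $\mathbf{B}_u \in \mathbb{R}^{(q+1) \times N}$, $\mathbf{B} \in \mathbb{R}^{N \times (N-q-1)}$, $\mathbf{H}_l \in \mathbb{R}^{N \times (q+1)}$ and $\mathbf{H}_r \in \mathbb{R}^{N \times (N-q-1)}$. Using (6) and assuming
that the input is causal, i.e., $x(n) = 0$ for $n < 0$, the optimization criterion in (5) can be restated as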


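$$ \min_{\mathbf{a},\mathbf{b}} \sum_{i=0}^{N-1} \bigl[ y(i) - x(i) * h_b(i) * a(i) \bigr]^{2}, \tag{10} $$

where $*$ denotes the convolution operation. In matrix notation the problem is equivalent to:

$$ \min_{\mathbf{a},\mathbf{b}} \|\mathbf{e}(\mathbf{a},\mathbf{b})\|^{2} \triangleq \min_{\mathbf{a},\mathbf{b}} \|\mathbf{y} - \mathbf{X}\mathbf{H}_l \mathbf{a}\|^{2}, \quad \text{where} \tag{11a} $$

$$ \mathbf{X}(i,j) \triangleq x(i-j), \quad \text{for } i,j = 1, \ldots, N. \tag{11b} $$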

This is a mixed optimization problem where the linear and nonlinear variables appear separately. If $\mathbf{H}_l$ (i.e., $\mathbf{b}$)
is known, then the linear LS estimate of the numerator is

$$ \hat{\mathbf{a}} \triangleq (\mathbf{X}\mathbf{H}_l)^{\#}\, \mathbf{y}, \tag{12} $$


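where $(\mathbf{X}\mathbf{H}_l)^{\#} \triangleq \bigl((\mathbf{X}\mathbf{H}_l)^{T}(\mathbf{X}\mathbf{H}_l)\bigr)^{-1}(\mathbf{X}\mathbf{H}_l)^{T}$ denotes the pseudo-inverse of $(\mathbf{X}\mathbf{H}_l)$. Plugging $\hat{\mathbf{a}}$ back in (11a),
the optimization criterion for $\mathbf{b}$ is given by

$$ \min_{\mathbf{b}} \|\mathbf{e}(\mathbf{b})\|^{2} \triangleq \min_{\mathbf{b}} \|\mathbf{y} - \mathbf{P}_{\mathbf{X}\mathbf{H}_l}\, \mathbf{y}\|^{2} = \min_{\mathbf{b}} \|[\mathbf{I}_N - \mathbf{P}_{\mathbf{X}\mathbf{H}_l}]\, \mathbf{y}\|^{2}, \tag{13} $$

where $\mathbf{P}_{\mathbf{X}\mathbf{H}_l} \triangleq \mathbf{X}\mathbf{H}_l \bigl((\mathbf{X}\mathbf{H}_l)^{T}(\mathbf{X}\mathbf{H}_l)\bigr)^{-1}(\mathbf{X}\mathbf{H}_l)^{T}$ denotes the projection matrix of $(\mathbf{X}\mathbf{H}_l)$. In (13), the parameters
in $\mathbf{b}$ are indirectly related to the error criterion in a rather complicated manner through $\mathbf{P}_{\mathbf{X}\mathbf{H}_l}$. Next, the inherent
matrix structure of the criterion in (13) is utilized to reparameterize the error criterion by relating it directly to
the coefficients in $\mathbf{b}$.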
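The linear step in (12) and the residual in (13) can be illustrated with a small NumPy sketch. It is a hedged example, not the report's implementation: the helper names, the test system and the use of `numpy.linalg.lstsq` in place of an explicit pseudo-inverse are assumptions made for illustration.

```python
import numpy as np

def lower_toeplitz(c, N):
    """N x N lower-triangular convolution matrix with first column c[:N]."""
    col = np.zeros(N)
    col[:min(N, len(c))] = c[:N]
    T = np.zeros((N, N))
    for j in range(N):
        T[j:, j] = col[:N - j]
    return T

def inverse_filter_ir(b, N):
    """First N samples of h_b(n), the impulse response of 1/B(z), b[0] = 1."""
    h = np.zeros(N)
    for n in range(N):
        acc = 1.0 if n == 0 else 0.0
        for k in range(1, min(n, len(b) - 1) + 1):
            acc -= b[k] * h[n - k]
        h[n] = acc
    return h

def numerator_given_b(x, y, b, q):
    """Eq. (12): LS numerator for a fixed b, plus the residual norm of (13)."""
    N = len(y)
    X = lower_toeplitz(x, N)
    H_l = lower_toeplitz(inverse_filter_ir(b, N), N)[:, :q + 1]
    A = X @ H_l
    a_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = y - A @ a_hat
    return a_hat, float(residual @ residual)

rng = np.random.default_rng(2)
N, q = 25, 1
a_true = np.array([1.0, 0.5])                # hypothetical numerator
b_true = np.array([1.0, -1.2, 0.72])         # hypothetical denominator
x = rng.standard_normal(N)
h_sys = np.convolve(inverse_filter_ir(b_true, N), a_true)[:N]
y = np.convolve(x, h_sys)[:N]                # noise-free output x * h

a_hat, crit = numerator_given_b(x, y, b_true, q)
_, crit_wrong = numerator_given_b(x, y, np.array([1.0, -0.5, 0.1]), q)
print(a_hat, crit, crit_wrong)
```

For the true denominator the residual vanishes on noise-free data, while a mismatched denominator leaves a nonzero residual; the criterion in (13) is exactly this residual viewed as a function of b alone.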

Let $X_I(z)$ be the inverse of the input sequence $X(z)$, i.e., $X(z) X_I(z) = 1$. Similar to (6) and (7), in matrix
notation,

$$ \mathbf{X}_I \mathbf{X} = \mathbf{I}_N, \tag{14a} $$

where $\mathbf{X}_I \in \mathbb{R}^{N \times N}$ is also a lower-triangular matrix defined as

$$ \mathbf{X}_I(i,j) \triangleq x_I(i-j), \quad \text{for } i,j = 1, \ldots, N. \tag{14b} $$

For finite N, this inverse exists as long as the first element of the input sequence is non-zero, i.e., $x(0) \neq 0$. This
is not a major restriction for the causal systems under consideration in this work because the output will have
non-zero leading values only when there is non-zero input. But it would be desirable that $X(z)$ be minimum-phase,
otherwise $X_I(z)$ may be unbounded for some values of z, which in turn may result in very high magnitudes
of $x_I(n)$ for large N. Combining (7) and (14) and using the partitioned forms of (9),


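$$ \mathbf{B}_I \mathbf{X}_I \mathbf{X} \mathbf{H}_I = \begin{bmatrix} \mathbf{B}_u \\ \mathbf{B}^{T} \end{bmatrix} \mathbf{X}_I \mathbf{X}\, [\, \mathbf{H}_l \mid \mathbf{H}_r \,] = \mathbf{I}_N \tag{15a} $$

$$ \begin{bmatrix} \mathbf{B}_u \mathbf{X}_I \mathbf{X} \mathbf{H}_l & \mathbf{B}_u \mathbf{X}_I \mathbf{X} \mathbf{H}_r \\ \mathbf{B}^{T} \mathbf{X}_I \mathbf{X} \mathbf{H}_l & \mathbf{B}^{T} \mathbf{X}_I \mathbf{X} \mathbf{H}_r \end{bmatrix} = \begin{bmatrix} \mathbf{I}_{q+1} & \mathbf{0}_{(q+1) \times (N-q-1)} \\ \mathbf{0}_{(N-q-1) \times (q+1)} & \mathbf{I}_{N-q-1} \end{bmatrix} \tag{15b} $$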


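The bottom-left corner element shows that the $N \times (N-q-1)$ matrix $\mathbf{X}_I^{T}\mathbf{B}$ and the $N \times (q+1)$ matrix $\mathbf{X}\mathbf{H}_l$ are
orthogonal, i.e., $(\mathbf{B}^{T}\mathbf{X}_I)(\mathbf{X}\mathbf{H}_l) = \mathbf{0}_{(N-q-1) \times (q+1)}$. By construction, $\mathrm{rank}(\mathbf{X}_I^{T}\mathbf{B}) + \mathrm{rank}(\mathbf{X}\mathbf{H}_l) = N$. Hence,
according to a property of projection matrices,

$$ \mathbf{P}_{\mathbf{X}_I^{T}\mathbf{B}} + \mathbf{P}_{\mathbf{X}\mathbf{H}_l} = \mathbf{I}_N. \tag{16} $$

Using this result in (13), the following reparameterized optimization criterion is obtained,

$$ \min_{\mathbf{b}} \|\mathbf{P}_{\mathbf{X}_I^{T}\mathbf{B}}\, \mathbf{y}\|^{2} = \min_{\mathbf{b}} \|\mathbf{X}_I^{T}\mathbf{B}\,(\mathbf{B}^{T}\mathbf{X}_I\mathbf{X}_I^{T}\mathbf{B})^{-1}\,\mathbf{B}^{T}\mathbf{X}_I\, \mathbf{y}\|^{2} \tag{17a} $$

$$ = \min_{\mathbf{b}}\; \mathbf{y}^{T}\mathbf{X}_I^{T}\mathbf{B}\,(\mathbf{B}^{T}\mathbf{X}_I\mathbf{X}_I^{T}\mathbf{B})^{-1}\,\mathbf{B}^{T}\mathbf{X}_I\, \mathbf{y}. \tag{17b} $$

In order to obtain an expression more convenient for optimization, define

$$ \mathbf{z} \triangleq \mathbf{X}_I\, \mathbf{y}. \tag{18} $$

It can be easily shown that

$$ \mathbf{B}^{T}\mathbf{z} = \mathbf{Z}\mathbf{b}, \tag{19a} $$

where the matrix $\mathbf{Z}$ is constructed with the elements of $\mathbf{z}$ as

$$ \mathbf{Z} \triangleq \begin{bmatrix}
z(q+1) & z(q) & \cdots & z(0) & 0 & \cdots & 0 \\
z(q+2) & z(q+1) & \cdots & \cdots & z(0) & \ddots & \vdots \\
\vdots & \vdots & & & & \ddots & 0 \\
z(p) & z(p-1) & \cdots & \cdots & \cdots & \cdots & z(0) \\
\vdots & \vdots & & & & & \vdots \\
z(N-1) & z(N-2) & \cdots & \cdots & \cdots & \cdots & z(N-p-1)
\end{bmatrix}. \tag{19b} $$

Using (19a) in (17), the optimization criterion can be re-written as

$$ \min_{\mathbf{b}}\; \mathbf{b}^{T}\mathbf{Z}^{T}(\mathbf{B}^{T}\mathbf{X}_I\mathbf{X}_I^{T}\mathbf{B})^{-1}\mathbf{Z}\mathbf{b}. \tag{20} $$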

Equation (12) and the reparameterized criterion in (20) are the final decoupled forms. It should be emphasized
here that, thus far, the theoretical derivations are mathematically exact, i.e., no linearization, approximation or
modification has been introduced at the outset.
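The projection identity that underlies the reparameterization, namely that the projectors onto the column spaces of $\mathbf{X}_I^{T}\mathbf{B}$ and $\mathbf{X}\mathbf{H}_l$ sum to the identity, can be verified numerically. The NumPy sketch below is an illustration under assumed values of N, q, b and x; the helper names are hypothetical.

```python
import numpy as np

def lower_toeplitz(c, N):
    """N x N lower-triangular convolution matrix with first column c[:N]."""
    col = np.zeros(N)
    col[:min(N, len(c))] = c[:N]
    T = np.zeros((N, N))
    for j in range(N):
        T[j:, j] = col[:N - j]
    return T

def projector(A):
    """Orthogonal projector onto the column space of a full-rank matrix A."""
    return A @ np.linalg.solve(A.T @ A, A.T)

N, q = 8, 2
b = np.array([1.0, -0.9, 0.2])                       # assumed denominator
x = np.array([1.0, 0.4, -0.3, 0.8, 0.1, -0.6, 0.5, 0.2])   # x(0) != 0

B_I = lower_toeplitz(b, N)
H_I = np.linalg.inv(B_I)          # convolution matrix of the inverse filter
X = lower_toeplitz(x, N)
X_I = np.linalg.inv(X)

H_l = H_I[:, :q + 1]              # left N x (q+1) block of H_I
B = B_I[q + 1:, :].T              # B^T is the bottom (N-q-1) x N block of B_I

orthogonality = np.max(np.abs(B.T @ X_I @ X @ H_l))
gap = np.max(np.abs(projector(X_I.T @ B) + projector(X @ H_l) - np.eye(N)))
print(orthogonality, gap)
```

Both printed quantities should be at the level of rounding error, which is consistent with the derivation being exact rather than approximate.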

According  to  the  Theorem  stated  in  the  Appendix  [14],  if  b  is  estimated  by  minimizing  the  criterion  in  (20) 
and  if  that  estimate  is  utilized  for  computing  a  using  (12),  then  the  resulting  estimates  are  the  unique  and  global 
minimizers  of  the  original  criterion  in  (5)  or  (11).  Furthermore,  once  the  optimal  b  is  known,  the  estimation  of 
the optimal a in (12) is a linear problem. But more importantly, a needs to be computed only once.

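Algorithm : The nonlinear criterion in (20) appears to be a weighted quadratic in the unknown vector $\mathbf{b}$.
But the weight matrix $(\mathbf{B}^{T}\mathbf{X}_I\mathbf{X}_I^{T}\mathbf{B})^{-1}$ itself is dependent on the unknowns in $\mathbf{B}$. The computational algorithm
exploits this weighted quadratic structure of the criterion. At the k-th iteration, the algorithm minimizes

$$ \min_{\mathbf{b}}\; \mathbf{b}^{T} \bigl[ \mathbf{Z}^{T} (\mathbf{B}^{(k-1)T}\mathbf{X}_I\mathbf{X}_I^{T}\mathbf{B}^{(k-1)})^{-1} \mathbf{Z} \bigr] \mathbf{b}, \tag{21} $$

where $\mathbf{B}^{(k-1)}$ is formed by using the estimate of $\mathbf{b}$ obtained at the previous iteration. $\mathbf{b}^{(0)} \triangleq [\, 1 \;\; 0 \;\; \cdots \;\; 0 \,]^{T}$ can
be used as the initial estimate of $\mathbf{b}$ to start the iterative process. Otherwise, the initial estimates could also be
found by setting the middle matrix $(\mathbf{B}^{T}\mathbf{X}_I\mathbf{X}_I^{T}\mathbf{B})^{-1}$ to identity, i.e., by optimizing

$$ \min_{\mathbf{b}}\; \mathbf{b}^{T}\mathbf{Z}^{T}\mathbf{Z}\mathbf{b}. \tag{22} $$

To ensure non-trivial solutions, $b(0)$ is set to unity. Once the iterations converge, the estimated $\mathbf{b}$ is used in (12)
to linearly estimate the numerator coefficient vector $\mathbf{a}$.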
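The iteration described above can be sketched compactly in NumPy. The code below is a hedged illustration under stated assumptions, not the authors' program: the function and variable names are hypothetical, the constraint b(0) = 1 is imposed by solving the reduced normal equations, the stopping tolerance is arbitrary, and the test uses noise-free data from an assumed second-order system with a minimum-phase input so that the inverse of X stays well conditioned.

```python
import numpy as np

def lower_toeplitz(c, N):
    """N x N lower-triangular convolution matrix with first column c[:N]."""
    col = np.zeros(N)
    col[:min(N, len(c))] = c[:N]
    T = np.zeros((N, N))
    for j in range(N):
        T[j:, j] = col[:N - j]
    return T

def simulate(a, b, x):
    """Output of A(z)/B(z) driven by x, with zero initial conditions."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
        acc -= sum(b[k] * y[n - k] for k in range(1, len(b)) if n - k >= 0)
        y[n] = acc
    return y

def om_io(x, y, p, q, iters=20, tol=1e-10):
    """Decoupled estimator sketch: iterate (21) for b, then use (12) for a."""
    N = len(y)
    X = lower_toeplitz(x, N)
    X_I = np.linalg.inv(X)                      # requires x[0] != 0
    z = X_I @ y                                 # Eq. (18)
    # Eq. (19b): row r holds [z(r), z(r-1), ..., z(r-p)], r = q+1, ..., N-1.
    Z = np.array([[z[r - k] if r - k >= 0 else 0.0 for k in range(p + 1)]
                  for r in range(q + 1, N)])
    b = np.zeros(p + 1)
    b[0] = 1.0                                  # b^(0) = [1 0 ... 0]^T
    for _ in range(iters):
        B = lower_toeplitz(b, N)[q + 1:, :].T   # N x (N-q-1) block of B_I
        G = B.T @ X_I                           # B^T X_I
        Q = Z.T @ np.linalg.solve(G @ G.T, Z)   # Z^T (B^T X_I X_I^T B)^-1 Z
        # Minimise b^T Q b subject to b(0) = 1.
        b_new = np.concatenate(([1.0], -np.linalg.solve(Q[1:, 1:], Q[1:, 0])))
        done = np.sum((b_new - b) ** 2) < tol
        b = b_new
        if done:
            break
    H_l = np.linalg.inv(lower_toeplitz(b, N))[:, :q + 1]
    a, *_ = np.linalg.lstsq(X @ H_l, y, rcond=None)   # Eq. (12)
    return a, b

N = 20
n = np.arange(N)
x = 0.7 ** n * np.cos(0.9 * n)                  # assumed minimum-phase input
a_true = np.array([1.0, 0.5])
b_true = np.array([1.0, -1.2, 0.72])
y = simulate(a_true, b_true, x)                 # noise-free output
a_est, b_est = om_io(x, y, p=2, q=1)
print(a_est, b_est)
```

The numerator is computed once, after the loop, which reflects the decoupling: only the p denominator unknowns take part in the iteration. With noisy data more iterations would generally be needed and convergence is not guaranteed by this sketch.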

On the Relationships with Other Methods : The proposed theoretical and algorithmic framework appears
to be the most general one in its own class of 1-D deterministic rational System Identification (SID) algorithms.
In fact, a large body of work on SID can be formulated as special cases of OM-IO. For example, in the case of Impulse
Response (IR) fitting, i.e., when $x(n) \triangleq \delta(n)$ and $y(n) \triangleq h_d(n)$, the desired IR, an optimal method (OM) has
been developed recently [8, 9]. The work in this part of the project may be considered to be a further generalization
of OM. The Evans-Fischl Method (EFM) [5] was an early precursor of OM. But EFM dealt only with the IR fitting
problem and it is applicable only for the strictly-proper case, i.e., when $p = q + 1$. Furthermore, the recently
proposed Maximum-Likelihood Method for exponential modeling (known as KiSS or IQML) is basically a complex
version of EFM with conjugate-symmetry constraints imposed on the $B(z)$ coefficients [6, 7]. Hence, KiSS/IQML
is also an important special case of OM-IO. Furthermore, when $p = q + 1$, the initialization step of OM is identical
to Prony's Method [10] or the Covariance Method of Linear Prediction [11, 13]. For general cases, Shanks [3] and
Burrus-Parks [4] also estimated the denominator using the initialization step of OM. For the numerator, the linear
estimator in (12) was used by Shanks whereas Burrus-Parks used $a(k) = \sum_{i=0}^{p} b(i)\, h_d(k-i)$, for $k = 0, 1, \ldots, q$.
Finally, the formulation presented here appears to be quite well-suited for deconvolution [22]. Specifically, if
the output and the channel IR (or, alternately, the estimates of $\mathbf{a}$ and $\mathbf{b}$) are available, then the criterion in (11)
can be appropriately modified to obtain an LS or ML estimate of the unknown input vector $\mathbf{x}$.
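The numerator formula attributed above to Burrus-Parks is a truncated convolution of the denominator coefficients with the desired impulse response. A minimal Python sketch follows; the function name and the small first-order example are assumptions used only to show the index handling.

```python
def burrus_parks_numerator(b, h_d, q):
    """a(k) = sum_i b(i) * h_d(k - i) for k = 0, ..., q, with h_d causal."""
    return [sum(b[i] * h_d[k - i] for i in range(len(b)) if k - i >= 0)
            for k in range(q + 1)]

# Impulse response of (1 + 0.5 z^-1) / (1 - 0.5 z^-1): 1, 1, 0.5, 0.25, ...
b = [1.0, -0.5]
h_d = [1.0, 1.0, 0.5, 0.25]
a = burrus_parks_numerator(b, h_d, q=1)
print(a)
```

Unlike the linear LS estimator in (12), this formula matches only the first q + 1 samples of the desired impulse response rather than minimizing the fitting error over all N samples.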

IV  :  Simulation  Results  :  In  all  figures,  the  true  and  modeled  impulse  responses  are  shown  in  solid  and 
dotted  lines,  respectively. 

Simulation 1 : In this case, white noise was passed through an ARMA(7,3) system with an arbitrary impulse
response. The output was corrupted with uncorrelated white noise. The first N = 30 input and output samples
were collected for identifying the system. True model orders were used for identifying the system. The results
with 30 dB and 15 dB SNR values are shown in Fig. 2A and 2B, respectively.

Simulation  2  -  Model  Reduction  :  For  the  same  data  sets  of  Simulation-1  an  ARMA(5,3)  model  was  used  for 
identifying  the  system.  Note  that  the  denominator  order  is  less  than  the  true  order  in  this  case.  The  results  with 
30dB  and  15dB  SNR  values  are  shown  in  Fig.  3A  and  3B,  respectively. 

The results of Simulation-1 indicate that the proposed algorithm is able to match the unknown model impulse
response  very  closely  by  minimizing  the  output  error  norm.  Simulation-2  demonstrates  that  the  algorithm  also 
has  the  capability  of  obtaining  reduced  order  models  with  good  fit. 

Number of Iterations for Convergence and CPU Times : For 30 dB SNR, the number of iterations for
convergence for the actual and reduced order cases were found to be 8 and 5, respectively. The iterations were
terminated in both cases when $\|\mathbf{b}_{i+1} - \mathbf{b}_{i}\|^{2}$ fell below a small preset tolerance. The corresponding CPU times
on a VAX-8550 were 3.0 and 2.59 seconds, respectively. Similar differences in performance were found for other
SNR values also. In general, the algorithm showed rapid convergence in all simulations performed. But if the
unknown system is non-minimum phase or if the SNR in the output data is too low, the algorithm may converge
to  a  suboptimum  or  the  estimates  may  oscillate.  In  order  to  guarantee  convergence,  the  proposed  iterative 
transformation  must  be  a  contraction  mapping.  This  may  be  difficult  to  demonstrate  in  general  for  any  arbitrary 
input-output  data  set.  Theoretical  analysis  of  the  convergence  properties  of  the  iterative  algorithm  needs  to  be 
performed. 
Appendix : Optimality Properties of the Separate Estimators

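THEOREM - (Adapted from Theorem 2.1 in [14]) : If $\hat{\mathbf{b}}$ is a global minimizer of $\|\mathbf{e}(\mathbf{b})\|^2$ in (13) and $\hat{\mathbf{a}}$ is estimated using that $\hat{\mathbf{b}}$ as in (12), i.e.,

$$\hat{\mathbf{a}} \triangleq (\mathbf{X}\hat{\mathbf{H}}_l)^{\#}\mathbf{y}, \tag{A.1}$$

where $\hat{\mathbf{H}}_l$ is formed using $\hat{\mathbf{b}}$, then $(\hat{\mathbf{a}}, \hat{\mathbf{b}})$ is a global minimizer of $\|\mathbf{e}(\mathbf{a},\mathbf{b})\|^2$ and $\|\mathbf{e}(\hat{\mathbf{a}},\hat{\mathbf{b}})\|^2 = \|\mathbf{e}(\hat{\mathbf{b}})\|^2$. Conversely, if $(\hat{\mathbf{a}}, \hat{\mathbf{b}})$ is a global minimizer of $\|\mathbf{e}(\mathbf{a},\mathbf{b})\|^2$, then $\hat{\mathbf{b}}$ is a global minimizer of $\|\mathbf{e}(\mathbf{b})\|^2$ and $\|\mathbf{e}(\hat{\mathbf{b}})\|^2 = \|\mathbf{e}(\hat{\mathbf{a}},\hat{\mathbf{b}})\|^2$. Finally, if there is a unique $\hat{\mathbf{a}}$ among all possible minimizing pairs of $\|\mathbf{e}(\mathbf{a},\mathbf{b})\|^2$, then $\hat{\mathbf{a}}$ must satisfy (A.1).

PROOF : See [14].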
Fig. 1: The Output Error Model Structure


Fig. 2: Estimated Impulse Response with output SNR values of (A) 30dB and (B) 15dB. True model-order : ARMA(7,3) and assumed model-order : ARMA(7,3). [Two panels: impulse response versus number of samples.]
Fig. 3: Estimated Impulse Response with output SNR values of (A) 30dB and (B) 15dB. True model-order : ARMA(7,3) and assumed model-order : ARMA(5,3). [Two panels: impulse response versus number of samples.]


Section  -  3.2  :  IDENTIFICATION  OF  1-D  RATIONAL  SYSTEMS  IN  THE  FREQUENCY  DOMAIN 

Summary 

A new Frequency-Domain (FD) approach is presented for optimal estimation of rational transfer function
coefficients. The proposed method seeks to match any arbitrarily-shaped FD specifications in the Least-Squares
(LS)  sense.  The  desired  specifications  may  be  arbitrarily  spaced  in  frequency.  The  design  is  performed  directly 
in  the  digital  domain  and  no  analog  to  digital  transformation  is  necessary.  The  proposed  method  makes  use  of 
the  inherent  mathematical  structure  in  this  rational  modeling  problem  to  theoretically  decouple  the  numerator 
and  denominator  estimation  problems  into  two  smaller  dimensional  problems.  The  denominator  criterion  is 
nonlinear  but  possesses  a  weighted-quadratic  structure  which  is  convenient  for  iterative  optimization.  The  optimal 
numerator  is  found  linearly  by  solving  a  set  of  simultaneous  equations.  The  decoupled  criteria  retain  the  global 
optimality  properties.  The  performance  of  the  algorithm  is  demonstrated  with  some  simulation  examples. 

I  :  Introduction 

Traditionally, digital filters are designed by applying the Impulse Invariance or Bilinear transformation to available analog designs. Classical analog designs utilize polynomial approximations to match standard filter shapes such as Low-Pass, High-Pass etc. [9, 10]. An obvious drawback of classical analog design techniques is that filters with arbitrary or non-classical specifications, as in the case of a notch filter, cannot be obtained. In this part of the report, a direct method for frequency-domain design of digital IIR filters is proposed. The method seeks to match a desired frequency response of arbitrary shape by minimizing the LS fitting error criterion. The LS criterion for this problem involves a multi-dimensional nonlinear search, and several linearized or modified approaches have been developed [2, 3, 21, 31]. There have been some ad-hoc attempts at designing digital filters with special shapes [9, 12]. Frequency-domain versions of Prony's algorithm have also been presented recently [14, 15, 19, 25]. But it appears that the underlying mathematical structure inherent in this rational modeling problem has not been fully exploited. In this work, the frequency-domain least-squares problem is formulated by identifying the orthogonal projection space, which is shown to be formed entirely by the denominator parameters. The optimal denominator is estimated by minimizing the norm of the exact projection of the desired response onto this space, which is independent of the numerator coefficients. The optimal numerator estimation problem then turns out to be a simple linear LS problem.

It is demonstrated in this work that the optimal rational identification problem in the frequency-domain
belongs to a special class of mixed-nonlinear optimization problems in which the linear and nonlinear variables
separate [13]. It is further shown that the true nonlinear criterion can be decoupled into :

(i)  a  purely  linear  problem  for  obtaining  the  optimal  numerator  coefficients  and 

(ii)  a  nonlinear  problem  of  reduced  dimensionality  for  determining  the  optimal  denominator  coefficients. 

These important underlying theoretical and algorithmic aspects of designing digital filters in the frequency-domain appear to have remained mostly un-utilized. After decoupling, the denominator criterion possesses a convenient weighted-matrix structure which is then utilized to develop an iterative minimization algorithm. Once the denominator is estimated, the optimal numerator is found only once with linear LS. The decoupled criteria retain the global optima of the original criterion. The proposed approach is closely related to some time-domain results developed recently by the present author [8, 16, 17]. The design methodology described here is based on matching desired Discrete-Time-Fourier-Transform (DTFT) values which may be arbitrarily spaced in frequency. But the algorithm can be easily modified if the desired specifications are available in the form of DFT values.

The Section is arranged as follows : In Subsection II, the rational transfer model is defined and the frequency-domain identification problem is stated. In Subsection III, some existing methods addressing this problem are briefly outlined. The details of the proposed decoupled solution are presented in Subsection IV. Some simulation examples are given in Subsection V to demonstrate the performance of the proposed approach.

II  :  The  Rational  Transfer  Function  Model  and  The  Frequency-domain  Design  Problem 


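An ARMA(p, q) digital filter can be modeled as :

$$H(z) = \sum_{i=0}^{\infty} h(i) z^{-i} = \frac{a_0 + a_1 z^{-1} + \cdots + a_q z^{-q}}{1 + b_1 z^{-1} + \cdots + b_p z^{-p}} = \frac{N(z)}{D(z)}. \tag{1}$$

Let

$$\mathbf{h} \triangleq [h(0)\ \ h(1)\ \ \cdots\ \ h(N-1)]^T \tag{2a}$$

be the vector with the first N significant samples of the impulse response of $H(z)$, and let

$$\mathbf{a} \triangleq [a_0\ \ a_1\ \ \cdots\ \ a_q]^T \tag{2b}$$

and

$$\mathbf{b} \triangleq [1\ \ b_1\ \ \cdots\ \ b_p]^T \tag{2c}$$

be the numerator and denominator coefficient vectors, respectively.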

Let $H_d(z)$ represent the desired IIR filter which needs to be modeled as $H(z)$ in (1). Using the notation of equation (1), let $H_d(\omega_k)$, $N(\omega_k)$ and $D(\omega_k)$ be defined as the frequency response values of $H_d(z)$, $N(z)$ and $D(z)$, respectively, at $z = e^{j\omega_k}$. The frequency-domain identification problem can be stated as follows :

Given $H_d(\omega_k)$, $k = 0, 1, 2, \ldots, N-1$, the desired frequency response values (possibly arbitrarily spaced), estimate the parameters in $N(\omega_k)$ and $D(\omega_k)$ by optimizing the following LS error criterion :

$$\min_{\mathbf{a},\mathbf{b}} \|\mathbf{e}_\omega\|^2 \triangleq \min_{\mathbf{a},\mathbf{b}} \sum_{i=0}^{N-1} \left| H_d(\omega_i) - \frac{N(\omega_i)}{D(\omega_i)} \right|^2. \tag{3}$$
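
As a small numerical illustration of criterion (3), the sketch below evaluates the fitting error for a given pair of coefficient vectors at arbitrarily spaced frequencies. The function names `poly_response` and `ls_fit_error` and the second-order example coefficients are assumptions made here for illustration; they do not come from the report.

```python
import numpy as np

def poly_response(c, w):
    """Evaluate sum_k c[k] * exp(-j*k*w) at each frequency in w (powers of z^-1)."""
    k = np.arange(len(c))
    return np.exp(-1j * np.outer(w, k)) @ np.asarray(c, dtype=float)

def ls_fit_error(a, b, w, h_desired):
    """Criterion (3): sum_i |H_d(w_i) - N(w_i)/D(w_i)|^2."""
    model = poly_response(a, w) / poly_response(b, w)
    return float(np.sum(np.abs(h_desired - model) ** 2))

# Twenty arbitrarily spaced frequencies in (0, pi), in rad/sample.
w = np.sort(np.random.default_rng(0).uniform(0.0, np.pi, 20))
a_true = np.array([1.0, 0.5])          # assumed numerator (q = 1)
b_true = np.array([1.0, -0.9, 0.5])    # assumed denominator (p = 2)
h_d = poly_response(a_true, w) / poly_response(b_true, w)

err_true = ls_fit_error(a_true, b_true, w, h_d)          # zero up to rounding
err_off = ls_fit_error(a_true, [1.0, -0.8, 0.5], w, h_d)  # perturbed denominator
```

The error is quadratic in a for fixed b but depends on b through the division by D, which is the source of the nonlinearity discussed in the following subsections.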


III  :  Some  Existing  Frequency-Domain  Direct  Design  Methods 

The  problem  stated  in  (3)  is  a  nonlinear  optimization  problem  and  standard  nonlinear  optimization  schemes  can 
be  used  [7,  11].  But  these  generic  algorithms  are  known  to  be  sensitive  to  initial  choice  of  estimates  and  they 
do  not  specifically  make  use  of  the  unique  mathematical  structures  inherent  in  this  problem.  Some  linearized 
methods  that  specifically  address  the  design  problem  stated  in  (3),  have  also  been  proposed  [2,3].  More  recently, 
a  decoupled  algorithm  that  utilizes  divided-differences  and  Newton-Raphson,  has  been  reported  in  [14,  28].  In 
order  to  motivate  the  proposed  algorithmic  framework,  brief  outlines  of  some  of  the  direct  FD  design  methods 
are  given  next. 

III.1 : Levy's Method (LM)

The  following  criterion  was  proposed  by  Levy  [2]  as  a  frequency-domain  counterpart  of  Kalman’s  original  work 
in  the  time-domain  [1]  : 

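$$\min_{\mathbf{a},\mathbf{b}} \|\mathbf{e}_{LM}\|^2 \triangleq \min_{\mathbf{a},\mathbf{b}} \sum_{i=0}^{N-1} \left| H_d(\omega_i) D(\omega_i) - N(\omega_i) \right|^2. \tag{4}$$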

Note that the original error criterion in (3) is modified in Levy's case. Apart from the obvious advantage of a single-step linear solution, this algorithm does not possess any other optimality properties. It may also be noted that Kalman/Levy-type approaches for the ARMA problem are closely related to Levinson's work on the all-pole problem [18], where only the first term of the error criterion was minimized. The AR parameter estimation work is further related to Prony's method [19] and Padé Approximation [20]. Similar error criteria for the ARMA problem were later rediscovered [21] and analyzed [22].
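
A Levy-type equation-error fit can be sketched in a few lines, under the assumptions that b_0 is fixed to one and that the coefficients are real, so the complex equations are stacked into real and imaginary parts. The function name `levy_fit` and the test system are illustrative choices made here, not taken from [2].

```python
import numpy as np

def levy_fit(w, h_d, q, p):
    """Linear LS fit minimizing sum_i |H_d(w_i) D(w_i) - N(w_i)|^2 with b_0 = 1."""
    E = np.exp(-1j * np.outer(w, np.arange(max(p, q) + 1)))   # E[i, k] = e^{-j k w_i}
    # H_d * (1 + sum_k b_k e^{-jkw}) - sum_k a_k e^{-jkw} = 0
    # => [E_q, -H_d * E_p] [a; b_1..b_p] = H_d, which is linear in the unknowns.
    A = np.hstack([E[:, :q + 1], -h_d[:, None] * E[:, 1:p + 1]])
    A_ri = np.vstack([A.real, A.imag])                # real-valued unknowns
    y_ri = np.concatenate([h_d.real, h_d.imag])
    theta, *_ = np.linalg.lstsq(A_ri, y_ri, rcond=None)
    a = theta[:q + 1]
    b = np.concatenate([[1.0], theta[q + 1:]])
    return a, b

w = np.linspace(0.1, 3.0, 25)
a_true, b_true = np.array([1.0, 0.5]), np.array([1.0, -0.9, 0.5])   # assumed system
E = np.exp(-1j * np.outer(w, np.arange(3)))
h_d = (E[:, :2] @ a_true) / (E @ b_true)
a_est, b_est = levy_fit(w, h_d, q=1, p=2)
```

For exact rational data the equation error vanishes at the true coefficients, so the fit recovers them; for noisy or non-rational data the result is only the minimizer of the modified criterion, not of (3).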


III.2 : Sanathanan-Koerner's Prefiltering Method (SKM)


The earliest work that most closely approximates the true LS fitting-error criterion appears to be due to Sanathanan and Koerner [3]. Their goal was to improve upon Levy's work, which did not really attempt to optimize the true criterion in (3). In this case, an initial estimate of the denominator, $D^{(0)}(\omega_i)$, is first obtained by minimizing Levy's criterion in (4), and then the following modified fitting error criterion is optimized at the k-th iteration [3],

$$\min_{\mathbf{a},\mathbf{b}} \|\mathbf{e}_{SK}\|^2 \triangleq \min_{\mathbf{a},\mathbf{b}} \sum_{i=0}^{N-1} \left| \frac{H_d(\omega_i) D(\omega_i)}{D^{(k-1)}(\omega_i)} - \frac{N(\omega_i)}{D^{(k-1)}(\omega_i)} \right|^2, \tag{5}$$

where $D^{(k-1)}(\omega_i)$ denotes the denominator estimate at the previous iteration, which is used as a prefilter for obtaining the estimates at the following iteration step. Note that (5) closely approximates (3) and both are identical if $D(\omega_i) = D^{(k-1)}(\omega_i)$. But using (5), the unknown parameters in a and b can be estimated simultaneously by solving a set of linear equations. A time-domain counterpart of Sanathanan-Koerner's method was later discovered independently by Steiglitz and McBride in [4], though the latter work is definitely better recognized in the Signal Processing and System Identification literature [9, 10, 23, 24].
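
The prefiltering iteration can be sketched as a reweighted version of the linear equation-error fit: each pass divides the equations by the previous denominator response. This is a minimal sketch assuming real coefficients, b_0 = 1 and a fixed number of passes; `sk_fit` and the noisy test data are hypothetical and are not taken from [3].

```python
import numpy as np

def sk_fit(w, h_d, q, p, n_iter=10):
    """Sanathanan-Koerner-type iteration: reweight the linear LS problem by 1/D^(k-1)(w_i)."""
    E = np.exp(-1j * np.outer(w, np.arange(max(p, q) + 1)))
    A = np.hstack([E[:, :q + 1], -h_d[:, None] * E[:, 1:p + 1]])
    prefilter = np.ones(len(w), dtype=complex)   # unit prefilter: first pass is the Levy-type fit
    for _ in range(n_iter):
        Aw = A * prefilter[:, None]
        yw = h_d * prefilter
        theta, *_ = np.linalg.lstsq(np.vstack([Aw.real, Aw.imag]),
                                    np.concatenate([yw.real, yw.imag]), rcond=None)
        a = theta[:q + 1]
        b = np.concatenate([[1.0], theta[q + 1:]])
        prefilter = 1.0 / (E[:, :p + 1] @ b)     # 1 / D^(k-1)(w_i) for the next pass
    return a, b

rng = np.random.default_rng(0)
w = np.linspace(0.1, 3.0, 40)
a_true, b_true = np.array([1.0, 0.5]), np.array([1.0, -0.9, 0.5])   # assumed system
E3 = np.exp(-1j * np.outer(w, np.arange(3)))
h_exact = (E3[:, :2] @ a_true) / (E3 @ b_true)
h_noisy = h_exact + 0.01 * (rng.standard_normal(40) + 1j * rng.standard_normal(40))
a_sk, b_sk = sk_fit(w, h_noisy, q=1, p=2)
```

Note that a and b are still solved for jointly at every pass, which is the point of contrast with the decoupled methods discussed next.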


III.3  :  Kumaresan’s  Decoupled  Method  -  Generalized  (KM-G) 


The Frequency-Domain error criterion in (3) has been recently decoupled by Kumaresan in [14, 15, 25, 28], where divided-difference matrices [26] have been utilized. Similar to a time-domain decoupled algorithm due to Evans and Fischl (EFM) [6], this approach was originally proposed for strictly-proper cases, i.e., when p = q + 1. In the brief outline given below, appropriate modifications have been introduced in order to generalize KM for any arbitrary numerator and denominator orders. For a q-th order numerator and a p-th order denominator, the decoupled criterion for estimating the optimal denominator is :

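$$\min_{\mathbf{b}} \ \mathbf{h}_\omega^H \mathbf{C}^H (\mathbf{C}\mathbf{C}^H)^{-1} \mathbf{C} \mathbf{h}_\omega, \tag{6}$$

where,

$$\mathbf{h}_\omega \triangleq [H_d(\omega_0)\ \ H_d(\omega_1)\ \ \cdots\ \ H_d(\omega_{N-1})]^T \tag{7a}$$

denotes the vector containing the N samples of the prescribed frequency response data.

[Equations (7b)-(7f), which define the matrices C, U, D and U_i, are illegible in the source.]

Defining $\mathbf{f} \triangleq \mathbf{U}\mathbf{D}\mathbf{h}_\omega$, a vector of length N + p - q - 1, the error criterion can be written in the following weighted-quadratic form :

$$\min_{\mathbf{b}} \ \mathbf{b}^T \mathbf{F}^H (\mathbf{C}\mathbf{C}^H)^{-1} \mathbf{F}\mathbf{b}, \tag{8}$$

where the matrix F, of size (N - q - 1) x (p + 1), is formed using the elements of f as follows :

$$\mathbf{F} \triangleq \begin{bmatrix} f(p) & f(p-1) & \cdots & f(0) \\ \vdots & \vdots & & \vdots \\ f(N+p-q-2) & f(N+p-q-3) & \cdots & f(N-q-2) \end{bmatrix}. \tag{9}$$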

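The optimal denominator coefficients are obtained using an iterative algorithm. Once the optimal denominator is available, the numerator is estimated as :

$$\hat{\mathbf{a}} = (\mathbf{D}_h \mathbf{U}_{q+1})^{\#} \mathbf{h}_\omega, \tag{10}$$

where # denotes the pseudo-inverse,

$$\mathbf{D}_h \triangleq \begin{bmatrix} \dfrac{1}{D(\omega_0)} & & 0 \\ & \ddots & \\ 0 & & \dfrac{1}{D(\omega_{N-1})} \end{bmatrix} \quad (N \times N) \tag{11}$$

and

$$\mathbf{U}_{q+1} \triangleq \begin{bmatrix} 1 & e^{j\omega_0} & e^{j2\omega_0} & \cdots \\ \vdots & \vdots & \vdots & \\ 1 & e^{j\omega_{N-1}} & e^{j2\omega_{N-1}} & \cdots \end{bmatrix} \quad (N \times (q+1)). \tag{12}$$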


It can be easily verified that for the special case of p = q + 1, the general criteria given here are exactly the same as those given in [14, 28]. It may be emphasized here that the frequency-domain LS algorithms in [2, 3] are,

(i) Approximations or modifications of the original criterion in (3), and

(ii) (p + q)-dimensional nonlinear optimization problems for estimating a and b simultaneously.

In contrast, the decoupled method (KM-G) estimates a and b separately. But simulation experiments indicate that the desired minimum of the criterion in (8) may not be achieved with only an Evans-Fischl type LS minimization of (8). Instead, a further Newton-Raphson step had to be incorporated in the algorithm in order to achieve the desired optimum [14]. Unlike KM-G, the optimally decoupled method developed in this work reaches the desired optimum more directly and without using Newton-Raphson.

It may also be noted that the Signal Processing Toolbox of the widely popular MATLAB software package provides a direct frequency-domain design macro called yulewalk, which basically implements a modified Yule-Walker method developed by Friedlander and Porat [31]. This method does not attempt to minimize the true criterion in (3). Instead, it attempts to fit the deterministic correlation values to obtain the rational model parameters by essentially minimizing an equation error. The simulation section includes some comparison of the performance of the proposed method with this approach.

IV  :  Proposed  Method  (OM-DTFT) 

For time-domain rational model identification problems, a new framework has been recently presented for decoupling the denominator and numerator problems into two separate but lower-dimensional optimization problems [8, 16, 17]. In this Section it is shown that the nonlinear frequency-domain criterion of (3) can also be decoupled in a similar fashion.

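Let $H_I(z)$ be the inverse filter corresponding to $D(z)$, i.e.,

$$D(z) H_I(z) = 1. \tag{13a}$$

Clearly, this is a convolution operation in the time-domain and it can be expressed using matrix notation as,

$$\mathbf{D}\mathbf{H}_I = \mathbf{I}_N, \tag{13b}$$

where $\mathbf{I}_N$ denotes an N x N identity matrix; the N x N matrices $\mathbf{D}$ and $\mathbf{H}_I$ are defined below in appropriate partitioned forms which will be useful in the algorithm :

$$\mathbf{D} \triangleq \begin{bmatrix} \mathbf{B}_u^T \\ \mathbf{B}^T \end{bmatrix}, \qquad \mathbf{H}_I \triangleq [\,\mathbf{H}_l \mid \mathbf{H}_r\,], \tag{14}$$

where $\mathbf{B}_u \in \mathbb{R}^{N \times (q+1)}$, $\mathbf{B} \in \mathbb{R}^{N \times (N-q-1)}$, $\mathbf{H}_l \in \mathbb{R}^{N \times (q+1)}$ and $\mathbf{H}_r \in \mathbb{R}^{N \times (N-q-1)}$. If the vector h, defined in (2a), represents the finite length impulse response vector containing N significant Impulse Response values, the frequency response at any frequency $\omega_i$ will be given as,

$$H(\omega_i) = \sum_{n=0}^{N-1} h(n) e^{-j\omega_i n}. \tag{15}$$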
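Stacking the model frequency response values at all the N specified frequencies, $\omega_0, \omega_1, \ldots, \omega_{N-1}$, the model frequency-domain vector can be expressed as :

$$\hat{\mathbf{h}}_\omega \triangleq [H(\omega_0)\ \ H(\omega_1)\ \ \cdots\ \ H(\omega_{N-1})]^T \tag{16a}$$

$$= \begin{bmatrix} 1 & e^{-j\omega_0} & \cdots & e^{-j(N-1)\omega_0} \\ 1 & e^{-j\omega_1} & \cdots & e^{-j(N-1)\omega_1} \\ \vdots & \vdots & & \vdots \\ 1 & e^{-j\omega_{N-1}} & \cdots & e^{-j(N-1)\omega_{N-1}} \end{bmatrix} \mathbf{h} \tag{16b}$$

$$\triangleq \mathbf{W}\mathbf{h}. \tag{16c}$$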
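
The DTFT matrix of (16b) is straightforward to form for arbitrarily spaced frequencies. The sketch below builds it and checks it against the direct sum in (15); the helper name `dtft_matrix` and the sample values are assumptions chosen for illustration.

```python
import numpy as np

def dtft_matrix(w, n_samples):
    """Matrix W with W[i, n] = exp(-j * w[i] * n), as in (16b)."""
    return np.exp(-1j * np.outer(w, np.arange(n_samples)))

w = np.array([0.1, 0.7, 1.3, 2.2, 2.9])          # arbitrarily spaced frequencies (rad/sample)
h = np.array([1.0, 0.5, 0.25, 0.125, 0.0625])    # example impulse-response samples
W = dtft_matrix(w, len(h))
h_w = W @ h                                      # stacked responses H(w_i)

# Direct evaluation of the sum in (15), for comparison.
direct = np.array([sum(h[n] * np.exp(-1j * wi * n) for n in range(len(h))) for wi in w])
```

When the frequencies are uniformly spaced at 2*pi*k/N, the same matrix reduces to the DFT matrix, which is the DFT special case mentioned in the Introduction.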


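By definition,

$$H(z) \triangleq \frac{N(z)}{D(z)} = H_I(z) N(z), \quad \text{using (13a)}, \tag{17}$$

where the right-hand-side represents convolution of the numerator coefficients with the inverse sequence $h_I(n)$ (corresponding to $H_I(z)$). Hence, it can be shown that the model impulse response vector h can be expressed as,

$$\mathbf{h} = \mathbf{H}_l \mathbf{a}. \tag{18}$$

Using this in (16),

$$\hat{\mathbf{h}}_\omega = \mathbf{W}\mathbf{H}_l \mathbf{a}. \tag{19}$$

With these definitions, the frequency-domain filter design problem in (3) can be restated as,

$$\min_{\mathbf{a},\mathbf{b}} \|\mathbf{e}\|^2 \triangleq \min_{\mathbf{a},\mathbf{b}} \|\mathbf{h}_\omega - \mathbf{W}\mathbf{H}_l \mathbf{a}\|^2, \tag{20}$$

where $\mathbf{h}_\omega$ is the vector of prescribed frequency response values defined in (7a).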
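
The relation h = H_l a in (18) can be checked numerically. The sketch below assumes that D is the N x N lower-triangular Toeplitz (convolution) matrix built from b, which is consistent with (13a) and (13b) but whose explicit layout is an assumption here; `conv_matrix` and `impulse_response` are hypothetical helper names.

```python
import numpy as np

def conv_matrix(b, n):
    """N x N lower-triangular Toeplitz matrix D with D[i, j] = b[i - j]."""
    D = np.zeros((n, n))
    for k, bk in enumerate(b):
        D += bk * np.eye(n, k=-k)      # place b_k on the k-th subdiagonal
    return D

def impulse_response(a, b, n):
    """First n samples of h(k) = a_k - sum_{j>=1} b_j h(k-j), with b[0] = 1."""
    h = np.zeros(n)
    for k in range(n):
        acc = a[k] if k < len(a) else 0.0
        acc -= sum(b[j] * h[k - j] for j in range(1, len(b)) if k - j >= 0)
        h[k] = acc
    return h

N = 12
a = np.array([1.0, 0.5])               # assumed numerator, q = 1
b = np.array([1.0, -0.9, 0.5])         # assumed denominator, p = 2
D = conv_matrix(b, N)
H_I = np.linalg.inv(D)                 # inverse-filter matrix, D H_I = I_N
H_l = H_I[:, :len(a)]                  # left N x (q+1) partition
h = H_l @ a                            # model impulse response via (18)
```

The vector h is linear in a once b (and hence H_l) is fixed, which is what makes the numerator a purely linear sub-problem.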


Equation (20) is an exact representation of the original criterion in (3), albeit in vector-matrix form. This form of the criterion explicitly demonstrates the linear relationship between the fitting error e and a, and also the nonlinear relationship between e and b through the matrix $\mathbf{H}_l$. From this equation, it is also apparent that this is a mixed optimization problem where the linear and nonlinear variables appear separately. In order to decouple the numerator and denominator estimation problems, consider the following. If $\mathbf{H}_l$ (i.e., b) is known, then the minimization of (20) will produce the linear LS estimate of a as follows,

$$\hat{\mathbf{a}} \triangleq (\mathbf{W}\mathbf{H}_l)^{\#} \mathbf{h}_\omega, \tag{21}$$

where $(\mathbf{W}\mathbf{H}_l)^{\#} \triangleq \left((\mathbf{W}\mathbf{H}_l)^H (\mathbf{W}\mathbf{H}_l)\right)^{-1} (\mathbf{W}\mathbf{H}_l)^H$. In practice though, b needs to be estimated also. Plugging $\hat{\mathbf{a}}$ back in (20), the optimization criterion for b is found as,

$$\begin{aligned} \min_{\mathbf{b}} \|\mathbf{h}_\omega - \mathbf{W}\mathbf{H}_l \hat{\mathbf{a}}\|^2 &= \min_{\mathbf{b}} \|\mathbf{h}_\omega - \mathbf{W}\mathbf{H}_l (\mathbf{W}\mathbf{H}_l)^{\#} \mathbf{h}_\omega\|^2 \\ &= \min_{\mathbf{b}} \|\mathbf{h}_\omega - \mathbf{P}_{WH_l} \mathbf{h}_\omega\|^2 \\ &= \min_{\mathbf{b}} \|(\mathbf{I}_N - \mathbf{P}_{WH_l}) \mathbf{h}_\omega\|^2, \end{aligned} \tag{22}$$

where $\mathbf{P}_{WH_l} \triangleq \mathbf{W}\mathbf{H}_l \left((\mathbf{W}\mathbf{H}_l)^H (\mathbf{W}\mathbf{H}_l)\right)^{-1} (\mathbf{W}\mathbf{H}_l)^H$ denotes the projection matrix of $\mathbf{W}\mathbf{H}_l$. Note that the numerator and denominator estimation problems are now in decoupled forms in equations (21) and (22), respectively. But in (22), the parameters in b are related to the error criterion in a somewhat complicated manner through $\mathbf{P}_{WH_l}$. Interestingly, the result of applying the operator $(\mathbf{I}_N - \mathbf{P}_{WH_l})$ to $\mathbf{h}_\omega$ in (22) is the component of $\mathbf{h}_\omega$ that is orthogonal to the subspace spanned by the columns of $\mathbf{W}\mathbf{H}_l$. Next it is shown that this orthogonal space can be completely defined by the denominator coefficients.
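
The two decoupled quantities can be computed directly for any trial denominator: the LS numerator of (21) and the residual norm that (22) minimizes over b. The sketch below assumes real numerator coefficients (so the complex LS problem is stacked into real and imaginary parts, a slight restriction of the complex pseudo-inverse in (21)) and a lower-triangular Toeplitz D; the function names and test system are hypothetical.

```python
import numpy as np

def inverse_filter_columns(b, n, n_cols):
    """First n_cols columns of H_I = D^{-1}, with D the N x N convolution matrix of b."""
    D = np.zeros((n, n))
    for k, bk in enumerate(b):
        D += bk * np.eye(n, k=-k)
    return np.linalg.inv(D)[:, :n_cols]

def numerator_and_residual(b, w, h_w, q):
    """For a fixed b, return the LS numerator (21) and the squared residual norm of (22)."""
    n = len(w)
    W = np.exp(-1j * np.outer(w, np.arange(n)))
    M = W @ inverse_filter_columns(b, n, q + 1)          # the matrix W H_l
    # Real-valued a (assumption): solve the stacked real/imaginary LS problem.
    a, *_ = np.linalg.lstsq(np.vstack([M.real, M.imag]),
                            np.concatenate([h_w.real, h_w.imag]), rcond=None)
    resid = float(np.sum(np.abs(h_w - M @ a) ** 2))
    return a, resid

N, q = 10, 1
w = 2 * np.pi * (np.arange(N) + 0.25) / N                # distinct frequencies
a_true, b_true = np.array([1.0, 0.5]), np.array([1.0, -0.9, 0.5])
W = np.exp(-1j * np.outer(w, np.arange(N)))
h_w = W @ (inverse_filter_columns(b_true, N, q + 1) @ a_true)

a_hat, r_true = numerator_and_residual(b_true, w, h_w, q)
_, r_off = numerator_and_residual(np.array([1.0, -0.7, 0.5]), w, h_w, q)
```

Only b is searched over; for every candidate b the best a follows from a single linear solve.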


IV.1 : Reparameterization

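Let $\mathbf{W}_I$ denote the inverse of the DTFT matrix W, i.e., $\mathbf{W}_I \mathbf{W} = \mathbf{I}_N$. This inverse exists as long as the frequencies $\omega_k$ are distinct. In combination with (13b),

$$\mathbf{D}\mathbf{W}_I \mathbf{W}\mathbf{H}_I = \mathbf{I}_N. \tag{23}$$

Use of the partitioned forms of (14) in (23) leads to,

$$\begin{bmatrix} \mathbf{B}_u^T \\ \mathbf{B}^T \end{bmatrix} \mathbf{W}_I \mathbf{W} [\,\mathbf{H}_l \mid \mathbf{H}_r\,] = \begin{bmatrix} \mathbf{B}_u^T \mathbf{W}_I \mathbf{W}\mathbf{H}_l & \mathbf{B}_u^T \mathbf{W}_I \mathbf{W}\mathbf{H}_r \\ \mathbf{B}^T \mathbf{W}_I \mathbf{W}\mathbf{H}_l & \mathbf{B}^T \mathbf{W}_I \mathbf{W}\mathbf{H}_r \end{bmatrix} = \begin{bmatrix} \mathbf{I}_{(q+1)} & \mathbf{0}_{(q+1)\times(N-q-1)} \\ \mathbf{0}_{(N-q-1)\times(q+1)} & \mathbf{I}_{(N-q-1)} \end{bmatrix}. \tag{24}$$

The bottom-left corner element shows that the N x (N - q - 1) matrix $\mathbf{W}_I^H \mathbf{B}$ and the N x (q + 1) matrix $\mathbf{W}\mathbf{H}_l$ are orthogonal, i.e., $(\mathbf{B}^T \mathbf{W}_I)(\mathbf{W}\mathbf{H}_l) = \mathbf{0}_{(N-q-1)\times(q+1)}$. By construction,

$$\mathrm{rank}(\mathbf{W}_I^H \mathbf{B}) + \mathrm{rank}(\mathbf{W}\mathbf{H}_l) = N. \tag{25a}$$

Hence, using a property of projection matrices,

$$\mathbf{P}_{W_I^H B} + \mathbf{P}_{WH_l} = \mathbf{I}_N. \tag{25b}$$

Using this result in (22),

$$\min_{\mathbf{b}} \|\mathbf{e}_b\|^2 \triangleq \min_{\mathbf{b}} \|\mathbf{P}_{W_I^H B}\, \mathbf{h}_\omega\|^2 \tag{26a}$$

$$= \min_{\mathbf{b}} \|\mathbf{W}_I^H \mathbf{B} (\mathbf{B}^T \mathbf{W}_I \mathbf{W}_I^H \mathbf{B})^{-1} \mathbf{B}^T \mathbf{W}_I \mathbf{h}_\omega\|^2 \tag{26b}$$

$$= \min_{\mathbf{b}} \ \mathbf{h}_\omega^H \mathbf{W}_I^H \mathbf{B} (\mathbf{B}^T \mathbf{W}_I \mathbf{W}_I^H \mathbf{B})^{-1} \mathbf{B}^T \mathbf{W}_I \mathbf{h}_\omega. \tag{26c}$$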
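
The orthogonality in (24) and the complementary-projector identity (25b) are easy to confirm numerically. The sketch below again assumes that D is the lower-triangular Toeplitz matrix of b and that B^T consists of its last N - q - 1 rows; the jittered frequency grid and coefficient values are illustrative assumptions.

```python
import numpy as np

def projector(M):
    """Orthogonal projector onto the column space of a (possibly complex) matrix M."""
    return M @ np.linalg.pinv(M)

N, q = 8, 1
b = np.array([1.0, -0.9, 0.5])                       # assumed denominator, p = 2
rng = np.random.default_rng(0)
# Distinct, non-uniformly spaced frequencies around the unit circle.
w = 2 * np.pi * (np.arange(N) + 0.3 * rng.uniform(-1, 1, N)) / N
W = np.exp(-1j * np.outer(w, np.arange(N)))
W_I = np.linalg.inv(W)                               # exists because the w_k are distinct

D = sum(bk * np.eye(N, k=-k) for k, bk in enumerate(b))   # convolution matrix of b
H_l = np.linalg.inv(D)[:, :q + 1]                    # left partition of H_I
B = D[q + 1:, :].T                                   # N x (N-q-1); B^T = last rows of D

P_B = projector(W_I.conj().T @ B)                    # projector onto range(W_I^H B)
P_H = projector(W @ H_l)                             # projector onto range(W H_l)
identity_gap = np.max(np.abs(P_B + P_H - np.eye(N)))
```

Because B depends only on b, the residual projector in (22) can be formed without ever computing H_l, which is the practical content of the reparameterization.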

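Note that this reparameterized criterion is directly related to b, as desired. In order to further simplify this expression, define a vector z of length N as,

$$\mathbf{z} \triangleq \mathbf{W}_I \mathbf{h}_\omega, \tag{27}$$

such that the criterion in (26) becomes,

$$\min_{\mathbf{b}} \ \mathbf{z}^H \mathbf{B} (\mathbf{B}^T \mathbf{W}_I \mathbf{W}_I^H \mathbf{B})^{-1} \mathbf{B}^T \mathbf{z}. \tag{28}$$

It can be easily shown that,

$$\mathbf{B}^T \mathbf{z} \triangleq \mathbf{Z}\mathbf{b}, \tag{29a}$$

where the matrix Z is constructed with the elements of z as,

$$\mathbf{Z} \triangleq \begin{bmatrix} z(q+1) & z(q) & \cdots & z(0) & 0 & \cdots & 0 \\ \vdots & & & & & & \vdots \\ z(p) & z(p-1) & \cdots & & \cdots & z(1) & z(0) \\ \vdots & & & & & & \vdots \\ z(N-1) & z(N-2) & \cdots & & & \cdots & z(N-p-1) \end{bmatrix}. \tag{29b}$$

Using (29) in (28), the optimization criterion can be rewritten as,

$$\min_{\mathbf{b}} \ \mathbf{b}^T \mathbf{Z}^H (\mathbf{B}^T \mathbf{W}_I \mathbf{W}_I^H \mathbf{B})^{-1} \mathbf{Z}\mathbf{b}. \tag{30}$$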

Note that this alternate form has a weighted-quadratic structure which is convenient for minimization. Equations (21) and (30) represent the final decoupled estimators to be utilized in the algorithm described below. It should be emphasized here that, thus far, the theoretical derivations are mathematically exact, i.e., no linearization, approximation or modification has been introduced.

Regarding optimality properties of the decoupled estimators, theoretical results in [13] can be used to prove that if b is estimated by minimizing the criterion in (30) and if that estimate is utilized for computing a using (21), then the resulting estimates are the unique and global minimizers of the criterion in (20). The advantage of estimating the linear and nonlinear parameters independently is a reduction in computational load, because the iterative part is only with respect to the p coefficients in b. Based on the optimal b, estimation of the optimal a is a simple linear least squares problem. But more importantly, a needs to be computed only once.

IV.2  :  Algorithm 

The nonlinear optimization criterion in (30) possesses a very useful matrix structure. Specifically, the expression has the form of a weighted quadratic criterion in the unknown vector b. The matrices Z and $\mathbf{W}_I$ are known. But the weight matrix $(\mathbf{B}^T \mathbf{W}_I \mathbf{W}_I^H \mathbf{B})^{-1}$ itself depends on the unknowns in B. The computational algorithm utilizes this weighted quadratic structure of the criterion to formulate the iterations. Specifically, the algorithm minimizes the following quadratic error criterion at the k-th iteration step :

$$\min_{\mathbf{b}} \|\mathbf{e}_b\|^2 \triangleq \min_{\mathbf{b}} \ \mathbf{b}^T \mathbf{R}_1 \mathbf{b}, \tag{31a}$$

where $\mathbf{B}^{(k-1)}$ is formed by using the estimate of b obtained at the previous iteration and $\mathbf{R}_1 \triangleq [\mathbf{Z}^H (\mathbf{B}^{(k-1)T} \mathbf{W}_I \mathbf{W}_I^H \mathbf{B}^{(k-1)})^{-1} \mathbf{Z}]$ is the weight-matrix. An initial estimate of b is necessary to start the iterative process. $\mathbf{b}^{(0)} \triangleq [1\ \ 0\ \ \cdots\ \ 0]^T$ can be used, or the initial estimate could also be found by setting the middle matrix $(\mathbf{B}^T \mathbf{W}_I \mathbf{W}_I^H \mathbf{B})^{-1}$ to identity, i.e., by optimizing,

$$\min_{\mathbf{b}} \ \mathbf{b}^T \mathbf{Z}^H \mathbf{Z}\mathbf{b} \triangleq \min_{\mathbf{b}} \ \mathbf{b}^T \mathbf{R}_2 \mathbf{b}, \tag{31b}$$

where the weight-matrix $\mathbf{R}_2 \triangleq \mathbf{Z}^H \mathbf{Z}$. In order to ensure non-trivial solutions, the first term of the denominator, b(0), is set to unity. The computational algorithm is similar in nature to the time-domain counterparts developed recently [8, 16, 17]. As outlined in the Appendix, the algorithm has two phases. In Phase-1, the criterion in (31a) is minimized by neglecting the variation w.r.t. the weight matrix. Simulation experience shows that this Phase alone brings the error quite close to the minimum. But if necessary, the variation of the weight matrix may also be included by invoking Phase-2, where the gradient of the entire criterion is set to zero. Once the iterations converge, the estimated b is used in (21) to linearly estimate the numerator coefficient vector a.
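
The Phase-1 iteration described above can be sketched as follows. This is a minimal illustration under several assumptions that are not stated in this form in the report: D is taken as the lower-triangular Toeplitz matrix of b, the coefficients are kept real by using the real parts of the normal equations, the first pass uses the identity weight of (31b), and the stopping tolerance is arbitrary. All function names are hypothetical.

```python
import numpy as np

def conv_matrix(b, n):
    """N x N lower-triangular Toeplitz matrix D with D[i, j] = b[i - j]."""
    return sum(bk * np.eye(n, k=-k) for k, bk in enumerate(b))

def build_Z(z, q, p):
    """(N-q-1) x (p+1) matrix with Z[r, j] = z(q+1+r-j), zero for negative index."""
    n = len(z)
    Z = np.zeros((n - q - 1, p + 1), dtype=complex)
    for r in range(n - q - 1):
        for j in range(p + 1):
            if q + 1 + r - j >= 0:
                Z[r, j] = z[q + 1 + r - j]
    return Z

def phase1(w, h_w, q, p, n_iter=50, tol=1e-12):
    """Iteratively reweighted minimization of (30), then the linear numerator step (21)."""
    n = len(w)
    W = np.exp(-1j * np.outer(w, np.arange(n)))
    W_I = np.linalg.inv(W)
    Z = build_Z(W_I @ h_w, q, p)                   # z = W_I h_w, eq. (27)
    g, G = Z[:, 0], Z[:, 1:]                       # Z b = g + G [b_1..b_p]^T since b_0 = 1
    weight = np.eye(n - q - 1)                     # identity weight: criterion (31b)
    b = np.zeros(p + 1)
    for _ in range(n_iter):
        lhs = (G.conj().T @ weight @ G).real       # real parts keep b real-valued
        rhs = (G.conj().T @ weight @ g).real
        b_new = np.concatenate([[1.0], -np.linalg.solve(lhs, rhs)])
        converged = np.sum((b_new - b) ** 2) < tol
        b = b_new
        if converged:
            break
        B = conv_matrix(b, n)[q + 1:, :].T         # B^T = last N-q-1 rows of D
        weight = np.linalg.inv(B.T @ W_I @ W_I.conj().T @ B)
    M = W @ np.linalg.inv(conv_matrix(b, n))[:, :q + 1]     # W H_l
    a, *_ = np.linalg.lstsq(np.vstack([M.real, M.imag]),
                            np.concatenate([h_w.real, h_w.imag]), rcond=None)
    return a, b

N, q, p = 16, 1, 2
rng = np.random.default_rng(0)
w = 2 * np.pi * (np.arange(N) + 0.3 * rng.uniform(-1, 1, N)) / N
a_true, b_true = np.array([1.0, 0.5]), np.array([1.0, -0.9, 0.5])
h_true = np.linalg.inv(conv_matrix(b_true, N))[:, :q + 1] @ a_true
h_w = np.exp(-1j * np.outer(w, np.arange(N))) @ h_true
a_est, b_est = phase1(w, h_w, q, p)
```

The numerator is computed once, after the denominator iterations have stopped, mirroring the ordering described in the text. Convergence of this fixed-point scheme is not guaranteed in general, so a cap on the number of passes is kept.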
V : Simulation Results


Two  examples  are  included  to  demonstrate  the  effectiveness  of  the  proposed  algorithm.  The  first  example  considers 
a  Lowpass  filter  design  problem  whereas  the  second  one  designs  a  Notch  filter.  In  all  plots  the  frequency  response 
values are displayed up to half the sampling frequency. For the proposed method, only the Phase-1 results are
given. 

Simulation-1  :  Lowpass  Filter  Design 

Magnitude response values at 56 frequency points around the unit circle were taken for the matching purpose. In Fig. 1 the estimated response with p = 6 and q = 5 for the proposed method is shown by the dashed curve and the solid line represents the desired response. The algorithm converged in 6 iterations. For the sake of comparison with a widely used direct method, the Modified Yule-Walker method [30, 31] available in the MATLAB software package was used to design a 6th order filter. The magnitude response fit for this case is shown as the dot-dash line in Fig. 1.

Simulation-2  :  Notch  Filter 

A Notch Filter design problem was considered in this case. The magnitude response values at 101 frequency points around the unit circle were taken. The estimated response with a 10th order denominator and a 9th order numerator as produced by the proposed method, as well as the desired response, are shown in dB scale in Fig. 2 in dashed and solid lines, respectively. The algorithm converged in 11 iterations. The dash-dot line again shows the fit when the Modified Yule-Walker method [30, 31] was used to design the 10th order filter.

Discussion 

The first example has been adopted from [14, 28]. The results presented above for the proposed method did not require any generic nonlinear optimization technique, such as Newton-Raphson, to reach the final optimum. Also, during the minimization process, all the coefficients were constrained to be real and hence the filter is readily realizable. It may also be stated here that the final designs were stabilized using the macro called Polystab available in MATLAB [29, 30], where the unstable roots are flipped inside the unit circle. The simulations clearly demonstrate that the proposed method can closely match arbitrarily shaped frequency response data and it also appears to perform better than a widely used method for direct design.
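
The stabilization step, flipping unstable roots inside the unit circle, can be sketched as below. This is an independent NumPy illustration of the root-reflection idea, not the MATLAB macro itself; the function name `reflect_unstable_roots` and the returned gain factor are choices made here. Reflecting a root v to 1/conj(v) scales the magnitude response on the unit circle by the constant |v|, so the shape of the magnitude response is preserved.

```python
import numpy as np

def reflect_unstable_roots(b):
    """Reflect roots outside the unit circle to 1/conj(root); return new coefficients and gain."""
    b = np.asarray(b, dtype=float)
    roots = np.roots(b).astype(complex)
    outside = np.abs(roots) > 1.0
    gain = float(np.prod(np.abs(roots[outside])))     # |D_old| = gain * |D_new| on |z| = 1
    roots[outside] = 1.0 / np.conj(roots[outside])
    return np.real(b[0] * np.poly(roots)), gain

# Hypothetical unstable denominator: a conjugate pole pair of radius 1.25 and a pole at 0.5.
b_unstable = np.real(np.poly([1.25 * np.exp(1j * 0.8), 1.25 * np.exp(-1j * 0.8), 0.5]))
b_stable, gain = reflect_unstable_roots(b_unstable)
```

If the overall gain of the filter matters, the numerator can be divided by the returned gain factor so that the magnitude response of N/D is unchanged; the phase response is altered in any case.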
Fig. 1 : Lowpass Filter Design. The desired Lowpass response is shown as the solid line. The estimated responses using the proposed method and the Yule-Walker method are shown in dashed and dot-dash lines, respectively. The filter order is six.
Appendix : Computational Algorithm


The criterion in (31a) is non-linear in b and hence it cannot be minimized directly. But instead of using any generic non-linear optimization technique, the inherent mathematical structure of the criterion is utilized to develop an iterative computational algorithm. The algorithm consists of two phases. In Phase-1, the variation of the middle matrix $(\mathbf{B}^T \mathbf{W}_I \mathbf{W}_I^H \mathbf{B})$ in (31a) is not taken into account in the derivative calculations, whereas in Phase-2 the gradient of the error norm in (31a) is set to zero.

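Phase-1

The final form of the error vector in (31a) is rewritten as,

$$\mathbf{e}_b = \mathbf{W}_I^H \mathbf{B} (\mathbf{B}^T \mathbf{W}_I \mathbf{W}_I^H \mathbf{B})^{-1} \mathbf{Z}\mathbf{b} \tag{A.1}$$

$$\triangleq \mathbf{V}\mathbf{Z}\mathbf{b} \tag{A.2}$$

$$\triangleq \mathbf{V} [\,\mathbf{g} \mid \mathbf{G}\,] \begin{bmatrix} 1 \\ \bar{\mathbf{b}} \end{bmatrix} \tag{A.3}$$

$$= \mathbf{V}\mathbf{g} + \mathbf{V}\mathbf{G}\bar{\mathbf{b}}, \tag{A.4}$$

where,

$$\mathbf{V} \triangleq \mathbf{W}_I^H \mathbf{B} (\mathbf{B}^T \mathbf{W}_I \mathbf{W}_I^H \mathbf{B})^{-1}, \tag{A.5}$$

g and G denote the first column and the remaining p columns of Z, respectively, and

$$\bar{\mathbf{b}} \triangleq [b(1)\ \ b(2)\ \ \cdots\ \ b(p)]^T. \tag{A.6}$$

If the matrix V is treated as independent of $\bar{\mathbf{b}}$, an expression for $\bar{\mathbf{b}}$ can be easily obtained by minimizing $\|\mathbf{e}_b\|^2$ w.r.t. $\bar{\mathbf{b}}$ as follows :

$$\bar{\mathbf{b}} = -(\mathbf{V}\mathbf{G})^{\#} \mathbf{V}\mathbf{g} = -(\mathbf{G}^H \mathbf{V}^H \mathbf{V}\mathbf{G})^{-1} \mathbf{G}^H \mathbf{V}^H \mathbf{V}\mathbf{g}. \tag{A.7}$$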

But since V does depend on the elements in b, (A.7) can only be computed iteratively. At the (i + 1)-th step of iteration, $\mathbf{V}^{(i)}$ is formed using the estimate of b found in the i-th iteration step. This leads to the following iterative algorithm for computing $\mathbf{b}^{(i+1)}$ :

$$\mathbf{b}^{(i+1)} = \begin{bmatrix} 1 \\ -[\mathbf{F}^{(i)} \mathbf{G}]^{-1} [\mathbf{F}^{(i)}]\, \mathbf{g} \end{bmatrix}, \tag{A.8}$$

where,

$$\mathbf{F}^{(i)} \triangleq \mathbf{G}^H \mathbf{V}^{(i)H} \mathbf{V}^{(i)}. \tag{A.9}$$

The iterations are continued until $\|\mathbf{b}^{(i+1)} - \mathbf{b}^{(i)}\|^2 < \delta$, where $\delta$ is an arbitrarily small number. It must be noted here that the iterations in (A.8) may not always converge to the absolute minimum of the error criterion in (31a) and hence the estimated b may not be the optimum one. This is because in (A.8) the variability of V w.r.t. b had been ignored while minimizing $\|\mathbf{e}_b\|^2$. To achieve the optimum, the gradient of the complete expression of $\|\mathbf{e}_b\|^2$ must be set to zero. If desired, this can be done in Phase-2 of the algorithm, which is outlined next. It may be noted here that the simulation studies indicate that the Phase-1 iterations using (A.8) perform an excellent job of bringing the estimate very close to the optimum. Once the estimates of b converge, a is computed using (21).
Phase-2


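In this phase, the derivative of the matrix V w.r.t. b is taken into consideration while minimizing the fitting error norm. By setting the derivative of the squared norm in (31a) to zero, we obtain the updated estimate of the unknown denominator coefficients b(1), ..., b(p) of (A.6) at the (i + 1)-th iteration as,

$$\bar{\mathbf{b}}^{(i+1)} = -[\mathbf{S}^{(i)} \mathbf{G}]^{-1} [\mathbf{S}^{(i)}]\, \mathbf{g}, \tag{A.10}$$

where (suppressing the superscript (i)),

$$\mathbf{S} \triangleq \mathbf{L}^H \mathbf{V} + \mathbf{G}^H \mathbf{V}^H \mathbf{V}, \tag{A.11a}$$

$$\mathbf{L} \triangleq \left[ \frac{\partial \mathbf{V}}{\partial b(1)} \mathbf{Z}\mathbf{b} \ \ \cdots \ \ \frac{\partial \mathbf{V}}{\partial b(p)} \mathbf{Z}\mathbf{b} \right]. \tag{A.11b}$$

This minimization phase continues until convergence is reached, and this optimum b vector corresponds to a minimum of the error surface of $\|\mathbf{e}_b\|^2$.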
Section - 3.3 : OPTIMAL ESTIMATION OF THE PARAMETERS OF ALL-POLE TRANSFER FUNCTIONS

Summary 

An algorithm is proposed for optimal estimation of the parameters of Auto-Regressive (AR) or all-pole transfer function models from prescribed impulse response data. The transfer function coefficients are estimated by minimizing the $\ell_2$-norm of the exact model fitting error. Existing methods either minimize equation errors or modify the true non-linear fitting error criterion. In the proposed method, the multidimensional nonlinear error criterion has been decoupled into a purely linear and a nonlinear subproblem. Global optimality properties of the decoupled estimators have been established. For data corrupted with Gaussian-distributed noise, the proposed method produces Maximum-Likelihood Estimates (MLE) of the AR-parameters. The inherent mathematical structure in the non-linear subproblem is exploited in formulating an efficient iterative computational algorithm for its minimization. The proposed algorithm provides a useful computational tool, based on an appropriate theoretical foundation, for accurate modeling of all-pole systems from prescribed impulse response data. The effectiveness of the algorithm has been demonstrated with several simulation examples.

I. Introduction

Parameter estimation of unknown discrete-time linear systems is a fundamental problem in digital signal
processing. Parametric models overcome the infinite dimensionality problem of non-parametric models with a
parsimonious representation of systems in terms of only a finite number of parameters. Over the last few decades
these problems have been addressed in a large body of work in many different fields [1-17, 22, 24-38, 40-47].
Among the many parametric models used in signal processing, the Auto-Regressive (AR) or all-pole model is one of the
most effective and practical representations.

The AR-parameter identification problem arises both in stochastic and deterministic time-series analysis.
There are probably two primary reasons for the wide popularity of AR modeling in statistical time series analysis.
Firstly, according to Kolmogorov's theorem, any minimum phase transfer function $H(z)$ can be represented by a
possibly infinite order, stable minimum phase AR-model [9, 10]. This important theorem implies that even if an
AR model is picked erroneously, the unknown Power Spectral Density can still be matched closely as long as a
‘large enough’ AR model order is chosen. But the second and the main reason for the popularity of AR-modeling
is that it is possible to obtain reasonably good suboptimal estimates of the unknown AR-parameters by solving a
simultaneous set of linear equations.

Modeling human vocal tracts as all-pole systems and the corresponding speech signal as an AR process is one
of the most important applications of AR-modeling [4, 8]. Furthermore, two important modeling philosophies,
viz., Linear Prediction (LP) and Maximum Entropy methods, essentially produce the AR-parameters as their
estimates, regardless of the true underlying signal model.

This Section deals with the problem of estimating the parameters of an all-pole transfer function to match
a prescribed or desired impulse response specification. The least-squares Impulse Response (IR) model fitting
error has been chosen as the objective optimality criterion. Many well-known techniques developed for statistical
time-series analysis have been used successfully in the deterministic case also [7, 10]. AR-model fitting may
be considered a special case of estimating the unknown parameters of general ARMA (or pole-zero) models.
ARMA parameter estimation is known to be a multidimensional nonlinear optimization problem and there has
been extensive work on this subject [1-7, 10, 12, 13, 15, 16, 24-31, 34-37, 41-43, 47]. In one of the earlier
works, Kalman [1] proposed a linearized and approximate ‘equation error’ minimization technique which
produces suboptimal estimates. Several two-step procedures, in which the denominator and numerator polynomials
are estimated separately, have also been proposed [2, 30]. In these methods, the denominator is first estimated
by minimizing an ‘equation error’ and then the numerator is found by minimizing a linearized ‘fitting error’ [2]
or by setting the leading error samples to zero [30]. A thorough coverage of filter design by modeling may be
found in [7].

Equation error minimization is a commonly used optimization procedure for estimating AR-parameters.
In fact, the well-known linear prediction (LP) coefficients [7-10] are estimated by minimizing equation errors.
Linear predictors are used extensively in speech analysis, synthesis and coding. Many practical and efficient
algorithms are available for LP parameter estimation. Among these, the ‘Autocorrelation Method’ (AM) and
the ‘Covariance Method’ (CM) are the most popular. CM and AM do not produce optimal estimates in the sense
that the model fitting error norm is not minimized in either case. In contrast, Steiglitz and McBride proposed
a modified fitting error minimization criterion (SMM) for estimating the coefficients of general ARMA models
[3, 4, 7]. SMM has also been adapted for AR parameter estimation [7]. In the absence of any exact model fitting error
criterion, SMM has established itself as the standard method for AR and ARMA parameter estimation problems
[3, 4, 7, 10, 12, 22, 25, 34, 40, 47]. In [5], a decoupled exact fitting error minimization approach has also been
proposed by Evans and Fischl (EFM). But their algorithm is applicable only in the case of strictly proper ARMA
models where the number of poles must be exactly one more than the number of zeros. Consequently, the optimal
EFM is applicable for identifying first-order AR-models only. The proposed optimal algorithm has no such
restrictions.

The proposed algorithm originates from a recently developed optimal method (OM) for general ARMA
modeling [6]. Unlike EFM, the decoupled fitting error minimization approach in [6] is applicable to ARMA
models with arbitrary numbers of poles and zeros. Furthermore, in contrast to the methods in [1-4, 24, 30], no
linearization or modification of the error criterion is introduced in the theoretical derivation of the least-squares model
fitting criterion. In this Section, the complete derivation of the optimal solution for the AR case (OM-AR) is
presented for the first time. It is also shown that if the observation data consist of the true impulse response
corrupted by Gaussian noise, then the proposed optimization produces the Maximum-Likelihood
estimates (MLE) of the AR parameters. For other types of noise or deviations, least-squares estimates (LSE) are
found.

A critical step in the theoretical derivation of the error criterion is to decouple the multidimensional criterion
into a non-linear problem for the AR-parameters and a linear problem for the numerator coefficient. In the
decoupled form, the fitting error is found to be related to an equation error which is different from the ones that
appear in CM or AM. But the form of the equation error is shown to be mathematically appropriate for the AR
case. The non-linear criterion possesses an inherent matrix prefiltering structure which directly leads to
an efficient iterative computational algorithm for its minimization. Several simulation examples demonstrate the
superior performance of the proposed approach when compared to some of the existing suboptimal methods.

The  Section  is  arranged  as  follows  :  in  Subsection  II,  the  problem  is  defined,  the  connection  with  MLE  is 
established  and  some  existing  results  are  briefly  outlined.  In  Subsection  III  the  error  criterion  is  theoretically 
derived  for  the  AR  case  and  the  computational  algorithm  is  presented.  In  Subsection  IV,  several  simulation 
examples  are  given.  Finally,  in  Subsection  V,  some  concluding  remarks  are  given. 

II.  Problem  Statement  and  Previous  Results 

The $z$-domain transfer function for an auto-regressive model can be represented as,

$$ H(z) = \frac{n_0}{D(z)} = \frac{n_0}{1 + d_1 z^{-1} + \cdots + d_p z^{-p}}, \qquad (1) $$

where the coefficient of the $z^0$ term in the denominator has been assumed to be unity without any loss of generality.
As an example, $H(z)$ may represent the transfer function of the human vocal tract, which is commonly modeled as
an all-pole model. The model order $p$ is assumed to be known. In the case of speech signals, for example, a lot
of experience and knowledge is already available and the value of $p = 10$ or $8$ is usually chosen. An equivalent
representation of the transfer function $H(z)$ can also be written in terms of its impulse response as,

$$ H(z) = h(0) + h(1)z^{-1} + \cdots + h(N-2)z^{-(N-2)} + h(N-1)z^{-(N-1)} + \cdots. \qquad (2) $$

The first $N$ significant samples of $H(z)$ can be stacked in a vector form as,

$$ \mathbf{h} \triangleq [h(0) \;\; h(1) \;\; \cdots \;\; h(N-1)]^T. \qquad (3) $$

Next, the vector containing the $N$ samples of the ‘prescribed’ or ‘desired’ impulse response data is denoted as,

$$ \mathbf{h}_p \triangleq [h_p(0) \;\; h_p(1) \;\; \cdots \;\; h_p(N-1)]^T. \qquad (4) $$

The desired IR data vector may represent the impulse response of the vocal tract. With these definitions, the problem
addressed in this Section may be stated as follows :

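Given a desired impulse response $\mathbf{h}_p$, the goal is to obtain the optimal estimates of the model parameters
$n_0$ and $\mathbf{d}$ by minimizing the following least-squares IR model-fitting criterion :

$$ \min_{n_0,\mathbf{d}} \|\mathbf{e}\|^2 = \min_{n_0,\mathbf{d}} \sum_{i=0}^{N-1} \left| h_p(i) - \left[\frac{n_0}{D(z)}\right]\delta(i) \right|^2, \qquad (5) $$

where,

$$ \mathbf{e} \triangleq \mathbf{h}_p - \mathbf{h}, \qquad \delta(i) \triangleq \begin{cases} 1, & i = 0 \\ 0, & i \neq 0 \end{cases} \qquad \text{and} \qquad \mathbf{d} \triangleq [1 \;\; d_1 \;\; \cdots \;\; d_p]^T. $$

The notation $\left[\frac{n_0}{D(z)}\right]\delta(i)$ denotes the response of the system $n_0/D(z)$ when driven by an input sequence $\delta(i)$.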

Clearly, the criterion in (5) attempts to minimize the squared error between the desired and the estimated IR and
hence, it can be expected to produce a more accurate model than some well-known AR modeling methods (outlined
below) which only minimize ‘equation errors’. The least-squares problem in (5) is known to be nonlinear in $\mathbf{d}$ and
standard nonlinear optimization algorithms have been utilized before in [15, 25-29, 36]. It should be emphasized
that if the given IR-vector $\mathbf{h}_p$ is composed of the true IR-vector $\mathbf{h}$ and additive Gaussian noise
or deviations, then the minimization criterion in (5) is exactly equivalent to the maximization of the Likelihood
criterion [see ref. 10, pp. 242-248]. Hence, for such a scenario the algorithm proposed in this Section produces the
MLE of the AR-parameters. For all other types of noise and deviations, the Least-Squares Estimates are found.
It may also be noted that the MLE result in [10] is primarily based on the works in [5, 13, 44] where only the
strictly proper ARMA case was considered. The MLE for transfer functions with arbitrary numbers of poles and
zeros has been presented recently in [6].
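
As a concrete illustration of the criterion in (5), the following minimal sketch generates the impulse response of an all-pole model and evaluates the squared fitting error. It assumes NumPy; the function names and the example coefficients are hypothetical and chosen only for illustration.

```python
import numpy as np

def allpole_impulse_response(n0, d, N):
    """First N samples of H(z) = n0 / D(z), where d = [1, d_1, ..., d_p]."""
    p = len(d) - 1
    h = np.zeros(N)
    for i in range(N):
        # D(z) H(z) = n0  =>  h(i) = n0*delta(i) - sum_k d_k h(i-k)
        acc = n0 if i == 0 else 0.0
        for k in range(1, min(i, p) + 1):
            acc -= d[k] * h[i - k]
        h[i] = acc
    return h

def fitting_error_norm(h_p, n0, d):
    """Squared l2-norm of e = h_p - h, the quantity minimized in (5)."""
    e = h_p - allpole_impulse_response(n0, d, len(h_p))
    return float(e @ e)

# Illustrative data: an exact AR(2) response with poles at 0.6 +/- 0.4j.
d_true = np.array([1.0, -1.2, 0.52])
h_p = allpole_impulse_response(2.0, d_true, 32)
err_at_truth = fitting_error_norm(h_p, 2.0, d_true)
err_perturbed = fitting_error_norm(h_p, 2.0, np.array([1.0, -1.1, 0.5]))
```

The error vanishes at the generating parameters and is strictly positive for the perturbed denominator, which is the behaviour the criterion is meant to capture.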

In many applications, such as in linear prediction of speech signals [4, 8], only the estimation of the AR-parameters
is of primary concern. The two most commonly used LP algorithms, AM and CM, do not solve the
ideal problem stated in (5), whereas SMM attempts to solve the ideal problem by appropriate modification of the
criterion in (5). These three approaches are briefly summarized next.
Covariance Method [CM]

The $\ell_2$-norm of the following equation error is minimized [7-10] :

$$ \begin{bmatrix} h_p(p) & h_p(p-1) & \cdots & h_p(0) \\ \vdots & \vdots & & \vdots \\ h_p(N-1) & h_p(N-2) & \cdots & h_p(N-p-1) \end{bmatrix} \begin{bmatrix} 1 \\ d_1 \\ \vdots \\ d_p \end{bmatrix} = \mathbf{e}_{eq}^{CM}, \qquad (6a) $$

$$ \text{or,} \quad \mathbf{H}_{CM}\mathbf{d} \triangleq \mathbf{e}_{eq}^{CM}. \qquad (6b) $$

Note that $\mathbf{H}_{CM}$ is filled with available IR data only and hence $\mathbf{e}_{eq}^{CM}$ may be considered an ‘exact’ equation error.

Auto-correlation Method [AM]

In this case, the $\ell_2$-norm of the following equation error is minimized [7-10] :

$$ \begin{bmatrix} h_p(0) & 0 & \cdots & 0 \\ h_p(1) & h_p(0) & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ h_p(p) & h_p(p-1) & \cdots & h_p(0) \\ \vdots & \vdots & & \vdots \\ h_p(N-1) & h_p(N-2) & \cdots & h_p(N-p-1) \\ 0 & h_p(N-1) & \cdots & h_p(N-p) \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & h_p(N-1) \end{bmatrix} \begin{bmatrix} 1 \\ d_1 \\ \vdots \\ d_p \end{bmatrix} = \mathbf{e}_{eq}^{AM}, \qquad (7a) $$

$$ \text{or,} \quad \mathbf{H}_{AM}\mathbf{d} \triangleq \mathbf{e}_{eq}^{AM}. \qquad (7b) $$

The zeros in the upper and lower triangles of $\mathbf{H}_{AM}$ are not part of the prescribed IR-vector $\mathbf{h}_p$ and hence
$\mathbf{e}_{eq}^{AM}$ is not an exact equation error.

It can be observed from (6) and (7) that the equation error for CM uses only the data inside the observed
window $\{h_p(0) \ldots h_p(N-1)\}$, without making any prior assumptions about the data outside it. On the other hand,
AM effectively windows the data by setting the samples outside the observation frame to zero. For this reason, AM
usually produces less accurate estimates than CM. But it should also be noted that even though $\mathbf{e}_{eq}^{AM}$ is not an
exact equation error, one of the significant advantages of using AM is that the computationally efficient Levinson
recursion algorithm can be utilized. In the case of CM, a somewhat less efficient algorithm, Cholesky decomposition,
can be used [7-10]. Furthermore, the AR coefficients obtained by minimizing the norm of $\mathbf{e}_{eq}^{AM}$ produce a stable
transfer function.

Steiglitz-McBride Method [SMM]

This method was originally developed for general ARMA parameter identification but it has also been adapted
for AR parameter identification. For the AR case, the following modified fitting error criterion is optimized [7],

$$ \min_{n_0,\mathbf{d}} \sum_{i=0}^{N-1} \left| \frac{1}{\hat{D}(z)} \left[ D(z)h_p(i) - n_0\,\delta(i) \right] \right|^2. \qquad (8) $$

The estimate $\hat{D}(z)$ obtained at any iteration step is used as a prefilter for obtaining the updated estimates at the
succeeding iteration. Equation (8) closely approximates the criterion in (5) and both are identical if $D(z) = \hat{D}(z)$.
But the advantage of using (8) is that the unknown parameters in $\mathbf{d}$ and $n_0$ can be estimated by solving a set of
simultaneous linear equations. It may be noted here that in [7], the numerator coefficient $n_0$ had been assumed
to be unity but, in general, that may not be the case. The derivation of the proposed fitting error optimization
scheme is given next.
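
The prefiltering idea can be sketched as follows. This is one reading of the AR-adapted Steiglitz-McBride iteration, assuming NumPy: the data and a unit impulse are filtered by the reciprocal of the previous denominator estimate, and the resulting residual is then linear in the unknowns. The function names, the starting point (no prefilter) and the fixed iteration count are assumptions made for illustration.

```python
import numpy as np

def allpole_filter(x, d):
    """y = x filtered by 1/D(z): y(i) = x(i) - sum_k d_k y(i-k)."""
    p = len(d) - 1
    y = np.zeros(len(x))
    for i in range(len(x)):
        y[i] = x[i] - sum(d[k] * y[i - k] for k in range(1, min(i, p) + 1))
    return y

def delayed(u, k):
    """u delayed by k >= 1 samples, zero-padded at the start."""
    return np.concatenate((np.zeros(k), u[:len(u) - k]))

def steiglitz_mcbride_ar(h_p, p, iterations=10):
    N = len(h_p)
    delta = np.zeros(N)
    delta[0] = 1.0
    d = np.concatenate(([1.0], np.zeros(p)))   # first pass uses no prefilter
    n0 = 0.0
    for _ in range(iterations):
        u = allpole_filter(h_p, d)             # prefiltered data
        v = allpole_filter(delta, d)           # prefiltered impulse
        # Residual u(i) + sum_k d_k u(i-k) - n0 v(i) is linear in (d_1..d_p, n0).
        A = np.column_stack([delayed(u, k) for k in range(1, p + 1)] + [-v])
        theta = np.linalg.lstsq(A, -u, rcond=None)[0]
        d = np.concatenate(([1.0], theta[:p]))
        n0 = float(theta[p])
    return n0, d

# Illustrative exact AR(2) data.
d_true = np.array([1.0, -1.2, 0.81])
impulse = np.zeros(30)
impulse[0] = 1.0
h_p = 1.5 * allpole_filter(impulse, d_true)
n0_smm, d_smm = steiglitz_mcbride_ar(h_p, 2)
```

With exact data the linear problem is consistent at every pass, so the generating parameters are recovered; with noisy data the iteration only approximates the minimizer of (5).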

III. Problem Formulation and Algorithm Development

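In this Subsection, the multidimensional optimization problem in (5) is decoupled into a linear estimation
problem for $n_0$ and a non-linear optimization problem for $\mathbf{d}$. Let $H_d(z)$ be the inverse filter corresponding to
$D(z)$, i.e.,

$$ D(z)H_d(z) = 1. \qquad (9) $$

In time domain, this corresponds to a convolution operation where the $d_k$'s are finite and the $h_d(k)$'s are infinite
in extent. The first $N$ significant terms of this convolution operation may be expressed in matrix notation as,

$$ \mathbf{D}\mathbf{H}_d = \mathbf{I}_N, \qquad (10) $$

where $\mathbf{I}_N$ denotes an $N \times N$ identity matrix,

$$ \mathbf{D} \triangleq \begin{bmatrix} 1 & 0 & \cdots & \cdots & 0 & 0 \\ d_1 & 1 & \cdots & \cdots & 0 & 0 \\ \vdots & & \ddots & & & \vdots \\ d_p & d_{p-1} & \cdots & 1 & \cdots & 0 \\ \vdots & & \ddots & & \ddots & \vdots \\ 0 & \cdots & d_p & \cdots & d_1 & 1 \end{bmatrix} \in \mathbb{R}^{N \times N} \qquad (11a) $$

and

$$ \mathbf{H}_d \triangleq \begin{bmatrix} h_d(0) & 0 & \cdots & 0 \\ h_d(1) & h_d(0) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ h_d(N-1) & h_d(N-2) & \cdots & h_d(0) \end{bmatrix} \in \mathbb{R}^{N \times N}. \qquad (11b) $$

Using (9), the expression in (1) can be rewritten as,

$$ H(z) = n_0 H_d(z). \qquad (12) $$

Equating the first $N$ coefficients of equal powers of $z^{-1}$ in both sides of (12) and using vector notation,

$$ \mathbf{h} = n_0\mathbf{h}_d, \qquad (13) $$

where $\mathbf{h}_d$ is also the first column of $\mathbf{H}_d$ defined in (11b), i.e.,

$$ \mathbf{h}_d \triangleq [h_d(0) \;\; h_d(1) \;\; \cdots \;\; h_d(N-1)]^T. \qquad (14) $$

With these definitions, the problem stated in (5) can be rephrased as,

$$ \min_{n_0,\mathbf{d}} \|\mathbf{e}\|^2 \triangleq \min_{n_0,\mathbf{d}} \|\mathbf{h}_p - n_0\mathbf{h}_d\|^2, \qquad (15) $$

where the error vector is defined as,

$$ \mathbf{e} \triangleq \mathbf{h}_p - n_0\mathbf{h}_d. \qquad (16) $$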
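
The matrix identity in (10) can be checked numerically. The sketch below assumes NumPy, uses hypothetical helper names and an arbitrary illustrative denominator, builds the lower-triangular Toeplitz matrices of (11a) and (11b), and confirms that their product is the identity.

```python
import numpy as np

def lower_toeplitz(first_col):
    """N x N lower-triangular Toeplitz matrix with the given first column."""
    N = len(first_col)
    T = np.zeros((N, N))
    for j in range(N):
        T[j:, j] = first_col[:N - j]
    return T

def inverse_filter_response(d, N):
    """First N samples h_d(0..N-1) of H_d(z) = 1 / D(z)."""
    p = len(d) - 1
    h = np.zeros(N)
    for i in range(N):
        h[i] = (1.0 if i == 0 else 0.0) - sum(d[k] * h[i - k] for k in range(1, min(i, p) + 1))
    return h

N = 8
d = np.array([1.0, -1.2, 0.81])                                # illustrative AR(2) denominator
D = lower_toeplitz(np.concatenate((d, np.zeros(N - len(d)))))  # (11a)
h_d = inverse_filter_response(d, N)                            # (14)
H_d = lower_toeplitz(h_d)                                      # (11b)
product = D @ H_d                                              # should equal I_N by (10)
```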
It is clear from (16) that the error $\mathbf{e}$ is linearly related to $n_0$ whereas $\mathbf{e}$ is non-linearly related to $\mathbf{d}$ through the
vector $\mathbf{h}_d$. In this form, it is apparent that the present problem belongs to a class of mixed optimization problems
where the linear and nonlinear variables appear separately. This class of problems has been studied extensively
by numerical analysts [18-21]. In their work, the main objective had been to optimize the two sets of variables
independently. Their argument goes as follows. If $\mathbf{h}_d$ (i.e., $\mathbf{d}$) is known, then $n_0$ can be optimally estimated by
the minimization of the criterion in (15) and the resulting least-squares estimate will be given by,

$$ \hat{n}_0 \triangleq \mathbf{h}_d^{\#}\mathbf{h}_p, \qquad (17) $$

where $\#$ denotes the pseudo-inverse operation defined as $\mathbf{h}_d^{\#} \triangleq (\mathbf{h}_d^T\mathbf{h}_d)^{-1}\mathbf{h}_d^T$. In practice, $\mathbf{d}$ will not be known and
it has to be estimated. Plugging $\hat{n}_0$ back in (15), the optimization criterion for $\mathbf{d}$ can be found as,

$$ \min_{n_0,\mathbf{d}} \|\mathbf{h}_p - n_0\mathbf{h}_d\|^2 = \min_{\mathbf{d}} \|\mathbf{h}_p - (\mathbf{h}_d\mathbf{h}_d^{\#})\mathbf{h}_p\|^2 \qquad (18a) $$

$$ = \min_{\mathbf{d}} \|(\mathbf{I}_N - \mathbf{P}_{h_d})\mathbf{h}_p\|^2, \qquad (18b) $$

where $\mathbf{P}_{h_d}$ denotes the projection matrix defined as $\mathbf{P}_{h_d} \triangleq \mathbf{h}_d(\mathbf{h}_d^T\mathbf{h}_d)^{-1}\mathbf{h}_d^T$. For a larger class of multidimensional
nonlinear optimization problems, it has been proved in Theorem 2.1 of [18] that if $\mathbf{d}$ is estimated by minimizing
the criterion in (18) and if that estimate is utilized for computing $\hat{n}_0$ using (17), then the resulting estimates
are the unique and global minimizers of the criterion in (15). Hence, the original optimization problem in (5)
is identical to the decoupled estimators in (17) and (18). This type of decoupled optimization of linear and
non-linear subproblems had been utilized before in [5, 13, 45, 46] for the strictly proper ARMA case and in [6] for
the general ARMA case. The derivation for the AR case, as given here, appears to be new.
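
The decoupling in (17) and (18) can be illustrated for a fixed denominator. The sketch below assumes NumPy, with hypothetical function names and random illustrative data: it computes the linear estimate of the numerator, evaluates the projected criterion, and compares it with a direct evaluation of (15).

```python
import numpy as np

def inverse_filter_response(d, N):
    """First N samples of 1 / D(z), with d = [1, d_1, ..., d_p]."""
    p = len(d) - 1
    h = np.zeros(N)
    for i in range(N):
        h[i] = (1.0 if i == 0 else 0.0) - sum(d[k] * h[i - k] for k in range(1, min(i, p) + 1))
    return h

def decoupled_fit(h_p, d):
    """For a fixed d, return n0 from (17) and the criterion value in (18b)."""
    h_d = inverse_filter_response(d, len(h_p))
    n0_hat = float(h_d @ h_p) / float(h_d @ h_d)     # pseudo-inverse of one column
    P_hd = np.outer(h_d, h_d) / float(h_d @ h_d)     # projection onto span{h_d}
    r = (np.eye(len(h_p)) - P_hd) @ h_p
    return n0_hat, float(r @ r)

def direct_criterion(h_p, n0, d):
    """||h_p - n0 h_d||^2 as in (15)."""
    h_d = inverse_filter_response(d, len(h_p))
    return float(np.sum((h_p - n0 * h_d) ** 2))

rng = np.random.default_rng(0)
h_p = rng.standard_normal(16)            # arbitrary 'desired' response
d = np.array([1.0, -0.9, 0.4])           # arbitrary fixed denominator
n0_hat, crit = decoupled_fit(h_p, d)
```

Moving the numerator away from the value given by (17) can only increase the error, so the non-linear search needs to run over the denominator alone.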

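The AR-parameters in $\mathbf{d}$ are related to the error criterion in a complicated manner through $\mathbf{P}_{h_d}$. Hence, the
direct optimization of (18) w.r.t. $\mathbf{d}$ would require resorting to standard non-linear optimization techniques
such as Newton-Raphson or Gauss-Newton methods. Instead, following the strategy used in [6] for the general
ARMA case, the criterion in (18) is reparameterized by relating it directly to the coefficients in $\mathbf{d}$. Appropriate
partitioning of the matrices $\mathbf{D}$ and $\mathbf{H}_d$ gives,

$$ \mathbf{D} = \left[\begin{array}{cccccc} 1 & 0 & \cdots & \cdots & 0 & 0 \\ \hline d_1 & 1 & \cdots & \cdots & 0 & 0 \\ \vdots & & \ddots & & & \vdots \\ d_p & d_{p-1} & \cdots & 1 & \cdots & 0 \\ \vdots & & \ddots & & \ddots & \vdots \\ 0 & \cdots & d_p & \cdots & d_1 & 1 \end{array}\right] \triangleq \begin{bmatrix} \mathbf{d}_1^T \\ \mathbf{B}^T \end{bmatrix} \quad \text{and} \qquad (19a) $$

$$ \mathbf{H}_d = \left[\begin{array}{c|ccc} h_d(0) & 0 & \cdots & 0 \\ h_d(1) & h_d(0) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ h_d(N-1) & h_d(N-2) & \cdots & h_d(0) \end{array}\right] \triangleq \left[\, \mathbf{h}_d \;\vdots\; \bar{\mathbf{H}}_d \,\right]. \qquad (19b) $$

Using these notations, the expression in (10) can be rewritten as,

$$ \begin{bmatrix} \mathbf{d}_1^T \\ \mathbf{B}^T \end{bmatrix} \left[\, \mathbf{h}_d \;\vdots\; \bar{\mathbf{H}}_d \,\right] = \begin{bmatrix} \mathbf{d}_1^T\mathbf{h}_d & \mathbf{d}_1^T\bar{\mathbf{H}}_d \\ \mathbf{B}^T\mathbf{h}_d & \mathbf{B}^T\bar{\mathbf{H}}_d \end{bmatrix} \qquad (19c) $$

$$ = \begin{bmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \mathbf{I}_{N-1} \end{bmatrix}. \qquad (19d) $$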
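The bottom-left corner element shows that the $N \times (N-1)$ matrix $\mathbf{B}$ and the vector $\mathbf{h}_d$ are orthogonal, i.e.,
$\mathbf{B}^T\mathbf{h}_d = \mathbf{0}$. Also by construction,

$$ \mathrm{rank}(\mathbf{B}) + \mathrm{rank}(\mathbf{h}_d) = N. \qquad (20) $$

Hence, using a theorem on projection matrices [39],

$$ \mathbf{P}_B + \mathbf{P}_{h_d} = \mathbf{I}_N. \qquad (21) $$

Using this relationship in (18b), the following equivalent forms of the reparameterized optimization criterion are
obtained,

$$ \min_{\mathbf{d}} \|\mathbf{P}_B\mathbf{h}_p\|^2 \qquad (22a) $$

$$ = \min_{\mathbf{d}} \|\mathbf{B}(\mathbf{B}^T\mathbf{B})^{-1}\mathbf{B}^T\mathbf{h}_p\|^2 \qquad (22b) $$

$$ = \min_{\mathbf{d}} \|\mathbf{B}(\mathbf{B}^T\mathbf{B})^{-1}\mathbf{e}_{eq}\|^2 \qquad (22c) $$

$$ = \min_{\mathbf{d}} \mathbf{e}_{eq}^T(\mathbf{B}^T\mathbf{B})^{-1}\mathbf{e}_{eq}, \qquad (22d) $$

where $\mathbf{e}_{eq}$ is an equation error defined as,

$$ \mathbf{e}_{eq} \triangleq \mathbf{B}^T\mathbf{h}_p. \qquad (23) $$

This equation error can also be rewritten as,

$$ \mathbf{e}_{eq} = \mathbf{B}^T\mathbf{h}_p = \begin{bmatrix} h_p(1) & h_p(0) & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ h_p(N-1) & h_p(N-2) & \cdots & h_p(N-p-1) \end{bmatrix} \begin{bmatrix} 1 \\ d_1 \\ \vdots \\ d_p \end{bmatrix} \qquad (24a) $$

$$ \triangleq \mathbf{H}_{AR}\mathbf{d}. \qquad (24b) $$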

A few observations may be made here regarding the equation error defined in (23). Clearly, $\mathbf{e}_{eq}$ differs from the
equation errors used in CM and AM as defined in (6) and (7), respectively. The equation errors in those cases
were formed in a somewhat ad hoc manner on the basis of two types of autocorrelation estimates [7, 9, 10]. On
the other hand, the particular form of equation error in (24a) resulted from purely mathematical consequences
of the AR case under consideration. In particular, if the prescribed response $\mathbf{h}_p$ happens to be an exact impulse
response of a $p$-th order AR transfer function, then the equation error in (24a) will be identically equal to zero,
but the same will not be true for $\mathbf{e}_{eq}^{AM}$ in (7). The equation error for CM appearing in (6) will also be zero but
it ignores the information contained in the upper $(p-1)$ equations of (24a). From this discussion, it can be
concluded that more accurate estimates may be obtained if the equation error in (23) is used for the AR case.
Minimization of this equation error will be utilized later in the computational algorithm for obtaining the initial
estimate of $\mathbf{d}$.
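
The orthogonality of B and h_d, the projection identity in (21) and the data-matrix form of the equation error in (24) can all be verified numerically. The following sketch assumes NumPy; the variable names and test values are illustrative only.

```python
import numpy as np

N, p = 10, 2
d = np.array([1.0, -1.2, 0.81])              # illustrative denominator

# Lower-triangular Toeplitz D of (11a); B^T is D without its first row, as in (19a).
D = np.zeros((N, N))
for k in range(N):
    for m in range(min(k, p) + 1):
        D[k, k - m] = d[m]
B = D[1:, :].T                               # N x (N-1)

h_d = np.linalg.solve(D, np.eye(N)[:, 0])    # first column of H_d = D^{-1}
P_B = B @ np.linalg.solve(B.T @ B, B.T)      # projection onto the columns of B
P_hd = np.outer(h_d, h_d) / (h_d @ h_d)      # projection onto h_d

# Equation error (23) and its data-matrix form (24): B^T h_p = H_AR d.
rng = np.random.default_rng(1)
h_p = rng.standard_normal(N)                 # arbitrary 'desired' response
H_AR = np.zeros((N - 1, p + 1))
for k in range(1, N):
    for m in range(min(k, p) + 1):
        H_AR[k - 1, m] = h_p[k - m]
e_eq = B.T @ h_p
```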

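Using (24b) in (22d), the reparameterized criterion can be expressed in the following useful form,

$$ \min_{\mathbf{d}} \mathbf{d}^T\mathbf{H}_{AR}^T(\mathbf{B}^T\mathbf{B})^{-1}\mathbf{H}_{AR}\mathbf{d}. \qquad (25) $$

According to Theorem 2.1 of [18], the denominator vector $\mathbf{d}$ causing the minimum of the criterion in (25) is the
desired optimum $\mathbf{d}^o$. The minimized error can then be found from,

$$ \mathbf{e}^o = \mathbf{P}_{B^o}\mathbf{h}_p, \qquad (26) $$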
where $\mathbf{B}^o$ is constructed by using the optimum $\mathbf{d}^o$. The optimum estimate of the impulse response is then,

$$ \mathbf{h}^o = \mathbf{h}_p - \mathbf{e}^o. \qquad (27) $$

From [18], it can also be inferred that if $\mathbf{h}_d^o$ is formed using $\mathbf{d}^o$ then the optimal $n_0$ can be obtained using (17)
as,

$$ n_0^o \triangleq \mathbf{h}_d^{o\#}\mathbf{h}_p. \qquad (28a) $$

Next it is shown that instead of using (28a), the optimal numerator coefficient can be found in a more straightforward
manner as,

$$ n_0^o = h^o(0), \qquad (28b) $$

where $h^o(0)$ is the first sample of $\mathbf{h}^o$, the optimal impulse response estimate found in (27). In order to demonstrate
this equivalence, equation (28b) is rewritten as,

$$ n_0^o = \mathbf{h}_1^T\mathbf{d}^o, \qquad (29a) $$

where,

$$ \mathbf{h}_1 \triangleq [h^o(0) \;\; 0 \;\; \cdots \;\; 0]^T. \qquad (29b) $$

Note that the first term in $\mathbf{d}$ is always 1. Using the partitioning notation in (19a), equation (29a) can also be
rewritten as,

$$ n_0^o = \mathbf{d}_1^T\mathbf{h}^o \qquad (30a) $$

$$ = \mathbf{d}_1^T(\mathbf{h}_p - \mathbf{e}^o), \quad \text{using (27),} \qquad (30b) $$

$$ = \mathbf{d}_1^T(\mathbf{h}_p - \mathbf{h}_p + \mathbf{h}_d^o\mathbf{h}_d^{o\#}\mathbf{h}_p), \quad \text{using (18a) and (26),} \qquad (30c) $$

$$ = (\mathbf{d}_1^T\mathbf{h}_d^o)\mathbf{h}_d^{o\#}\mathbf{h}_p, \qquad (30d) $$

$$ = \mathbf{h}_d^{o\#}\mathbf{h}_p, \qquad (30e) $$

where the last equality uses the fact that $\mathbf{d}_1^T\mathbf{h}_d = 1$, which appears in the upper-left partition of (19c). This
completes the proof of equivalence between the expressions in (28a) and (28b). It should be noted that (29) may
be preferable over (28a) for computing $n_0$ because the computation of $\mathbf{h}_d^o$ and the pseudo-inverse solution required
in (28a) can be avoided, whereas calculation of the optimal $\mathbf{h}^o$ in (27) may be a necessary step. Equations (22) and
(29) are the two desired decoupled formulae for estimation of the coefficients of the denominator and numerator
polynomials of the AR-model. It should be mentioned that unlike the decoupled forms of SMM given in [7] and
[34], no approximations were introduced in deriving the decoupled estimators in (22) and (29). A computational
algorithm for minimization of the criterion in (22) is outlined next.
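
The equivalence of (28a) and (28b) is easy to check numerically. The algebra in (30) relies only on (19c) and (21), so the sketch below, which assumes NumPy and uses a hypothetical function name, applies both formulae to an arbitrary denominator rather than to a converged optimum.

```python
import numpy as np

def numerator_two_ways(h_p, d):
    """Return n0 from the pseudo-inverse form (28a) and from the first-sample form (28b)."""
    N, p = len(h_p), len(d) - 1
    D = np.zeros((N, N))
    for k in range(N):
        for m in range(min(k, p) + 1):
            D[k, k - m] = d[m]
    B = D[1:, :].T
    h_d = np.linalg.solve(D, np.eye(N)[:, 0])        # impulse response of 1/D(z)
    e = B @ np.linalg.solve(B.T @ B, B.T @ h_p)      # fitting error, as in (26)
    h = h_p - e                                      # fitted response, as in (27)
    n0_pinv = float(h_d @ h_p) / float(h_d @ h_d)    # (28a)
    n0_first = float(h[0])                           # (28b)
    return n0_pinv, n0_first

rng = np.random.default_rng(2)
h_p = rng.standard_normal(12)                        # arbitrary data
n0_a, n0_b = numerator_two_ways(h_p, np.array([1.0, -0.7, 0.3]))
```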

Computational  Algorithm 

The criterion in (22) is non-linear in $\mathbf{d}$ and hence it cannot be minimized directly. Standard gradient-based
non-linear optimization techniques such as Newton-Raphson or Gauss-Newton algorithms could be used. But
these algorithms utilize only the first few terms of the Taylor series and are known to be highly sensitive to the choice
of the initial estimates. However, it can be observed from (25) that the error criterion possesses a good deal of matrix
structure. Specifically, the expression appears to be a weighted quadratic criterion in $\mathbf{d}$, except that the weight
matrix $(\mathbf{B}^T\mathbf{B})^{-1}$ itself is dependent on the unknowns in $\mathbf{d}$. This inherent mathematical structure of the criterion
will be utilized to develop an iterative computational algorithm. The algorithm is similar to the ones for ARMA
cases appearing in [5, 6, 13]. Here the complete derivation for the AR case will be given.
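In order to initiate the iterative algorithm, $\mathbf{d}$ is first estimated by minimizing the $\ell_2$-norm of the equation
error $\mathbf{e}_{eq}$ defined in (24a). Partitioning $\mathbf{H}_{AR}$, the equation error can be rewritten as follows,

$$ \mathbf{e}_{eq} = \left[\begin{array}{c|ccc} h_p(1) & h_p(0) & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ h_p(N-1) & h_p(N-2) & \cdots & h_p(N-p-1) \end{array}\right] \begin{bmatrix} 1 \\ d_1 \\ d_2 \\ \vdots \\ d_p \end{bmatrix} \qquad (31a) $$

$$ \triangleq \left[\, \mathbf{g} \;\vdots\; \mathbf{G} \,\right]\mathbf{d}. \qquad (31b) $$

Minimizing $\|\mathbf{e}_{eq}\|^2$ w.r.t. $\mathbf{d}$, the following initial estimate is obtained,

$$ \mathbf{d}^{(0)} = \begin{bmatrix} 1 \\ -\mathbf{G}^{\#}\mathbf{g} \end{bmatrix}. \qquad (32) $$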

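This estimate will be utilized for initiating the iterative computational algorithm. The final form of the error
vector in (22a) is rewritten as,

$$ \mathbf{e} = \mathbf{B}(\mathbf{B}^T\mathbf{B})^{-1}\mathbf{B}^T\mathbf{h}_p \qquad (33a) $$

$$ \triangleq \mathbf{W}\mathbf{B}^T\mathbf{h}_p \qquad (33b) $$

$$ = \mathbf{W}\mathbf{H}_{AR}\mathbf{d}, \quad \text{using (24),} \qquad (33c) $$

$$ = \mathbf{W}\left[\, \mathbf{g} \;\vdots\; \mathbf{G} \,\right]\mathbf{d}, \quad \text{using (31b),} \qquad (33d) $$

$$ = \mathbf{W}\mathbf{g} + \mathbf{W}\mathbf{G}\tilde{\mathbf{d}}, \qquad (33e) $$

where,

$$ \mathbf{W} \triangleq \mathbf{B}(\mathbf{B}^T\mathbf{B})^{-1} \quad \text{and,} \qquad (33f) $$

$$ \tilde{\mathbf{d}} \triangleq [d_1 \;\; d_2 \;\; \cdots \;\; d_p]^T. \qquad (33g) $$

If the matrix $\mathbf{W}$ is treated as independent of $\mathbf{d}$, minimization of $\|\mathbf{e}\|^2$ w.r.t. $\tilde{\mathbf{d}}$ results in the following estimate :

$$ \tilde{\mathbf{d}} = -(\mathbf{W}\mathbf{G})^{\#}\mathbf{W}\mathbf{g} = -(\mathbf{G}^T\mathbf{W}^T\mathbf{W}\mathbf{G})^{-1}\mathbf{G}^T\mathbf{W}^T\mathbf{W}\mathbf{g}. \qquad (34) $$

But $\mathbf{W}$ does depend on the elements in $\mathbf{d}$ and hence the estimate in (34) can only be computed iteratively.
The estimate of $\mathbf{d}$ found in the $i$-th iteration step is used in (33f) to form $\mathbf{W}^{(i)}$, which is then utilized at the
$(i+1)$-th step of iteration to compute $\mathbf{d}^{(i+1)}$ as follows :

$$ \mathbf{d}^{(i+1)} = \begin{bmatrix} 1 \\ -\left[\mathbf{X}^{(i)}\mathbf{G}\right]^{-1}\left[\mathbf{X}^{(i)}\right]\mathbf{g} \end{bmatrix}, \qquad (35a) $$

where,

$$ \mathbf{X}^{(i)} \triangleq \mathbf{G}^T\mathbf{W}^{(i)T}\mathbf{W}^{(i)} \qquad (35b) $$

$$ = \mathbf{G}^T\left(\mathbf{B}^{(i)T}\mathbf{B}^{(i)}\right)^{-1}. \qquad (35c) $$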
The iterations are continued until $\|\mathbf{d}^{(i+1)} - \mathbf{d}^{(i)}\|^2 < \delta$, where $\delta$ is an arbitrarily small number.

It must be noted here that the iterations in (35) may not always converge to the absolute minimum of
the error criterion in (5) and hence the estimated $\mathbf{d}$ may not be the optimum one. This is because in (35) the
variability of $\mathbf{W}$ w.r.t. $\mathbf{d}$ has been ignored while minimizing $\|\mathbf{e}\|^2$. To achieve the optimum, the gradient of the
complete expression of $\|\mathbf{e}\|^2$ must be set to zero. If desired, this may be done in a second phase of the algorithm
which is outlined in the Appendix. Invoking Phase-2 will assure that at least a local minimum will be achieved.
But it may be noted here that the simulation studies indicate that Phase-1 of the iterations using (35) does
an excellent job of bringing the estimate very close to the optimum. It will be shown in Subsection IV that
Phase-2, if invoked, causes almost insignificant changes in the $\mathbf{d}$ vector and the minimized error norm. In
simulations, the convergence was found to be quite rapid in both phases. Once the estimates of $\mathbf{d}$ converge,
$n_0$ is found using (26), (27) and (29), in sequence.
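
Phase-1 of the algorithm, i.e. the initial estimate in (32), the reweighted iteration in (35) and the final numerator step, can be sketched as below. This is a minimal illustration assuming NumPy; the function names, tolerance and iteration cap are hypothetical, dense linear solves stand in for the structured inversions discussed later, and Phase-2 is not included.

```python
import numpy as np

def allpole_impulse_response(n0, d, N):
    """First N samples of n0 / D(z), with d = [1, d_1, ..., d_p]."""
    p = len(d) - 1
    h = np.zeros(N)
    for i in range(N):
        h[i] = (n0 if i == 0 else 0.0) - sum(d[k] * h[i - k] for k in range(1, min(i, p) + 1))
    return h

def build_B(d, N):
    """N x (N-1) matrix B whose transpose holds rows 2..N of the Toeplitz matrix D."""
    p = len(d) - 1
    B = np.zeros((N, N - 1))
    for k in range(1, N):
        for m in range(min(k, p) + 1):
            B[k - m, k - 1] = d[m]
    return B

def build_H_AR(h_p, p):
    """(N-1) x (p+1) data matrix of (24a): row k is [h_p(k), ..., h_p(k-p)]."""
    N = len(h_p)
    H = np.zeros((N - 1, p + 1))
    for k in range(1, N):
        for m in range(min(k, p) + 1):
            H[k - 1, m] = h_p[k - m]
    return H

def optimal_ar_fit(h_p, p, tol=1e-12, max_iter=100):
    N = len(h_p)
    H = build_H_AR(h_p, p)
    g, G = H[:, 0], H[:, 1:]
    d = np.concatenate(([1.0], np.linalg.lstsq(G, -g, rcond=None)[0]))   # (32)
    for _ in range(max_iter):
        B = build_B(d, N)
        X = np.linalg.solve(B.T @ B, G).T                # G^T (B^T B)^{-1}, as in (35c)
        d_new = np.concatenate(([1.0], -np.linalg.solve(X @ G, X @ g)))  # (35a)
        converged = np.sum((d_new - d) ** 2) < tol
        d = d_new
        if converged:
            break
    B = build_B(d, N)
    e = B @ np.linalg.solve(B.T @ B, B.T @ h_p)          # (26)
    h = h_p - e                                          # (27)
    return float(h[0]), d, h                             # n0 from (28b)

# Illustrative use: exact data, then the same data with a small perturbation.
d_true = np.array([1.0, -1.2, 0.81])
h_exact = allpole_impulse_response(1.5, d_true, 30)
n0_fit, d_fit, h_fit = optimal_ar_fit(h_exact, 2)
rng = np.random.default_rng(0)
h_noisy = h_exact + 0.01 * rng.standard_normal(30)
n0_n, d_n, h_n = optimal_ar_fit(h_noisy, 2)
```

For exact data the equation error in (24a) is already zero at the generating coefficients, so the initial estimate is a fixed point of the iteration. In both cases the returned response is the impulse response of the fitted all-pole model.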

Discussion 

The major computational burden of the algorithm is in performing the iterative refinement in (35), where,
at each iteration step an $(N-1) \times (N-1)$ matrix $(\mathbf{B}^T\mathbf{B})$ needs to be inverted. But $(\mathbf{B}^T\mathbf{B})$ is a banded and
symmetric matrix which can be inverted using the computationally efficient Cholesky decomposition [8, 10]. Further
reduction in computation is also possible because though $(\mathbf{B}^T\mathbf{B})$ is not purely Toeplitz, a major $(N-p) \times (N-p)$
diagonal block is symmetric-banded-Toeplitz and this block can be inverted with $O[(N-p)\log(N-p)] + O[p^2]$
operations [23]. The other $(p-1) \times (p-1)$ diagonal block is symmetric and can be inverted with $O[(p-1)^3]$
operations. Furthermore, the non-diagonal blocks contain mostly zero elements. Hence, using the block matrix
inversion formula due to Schur [48], this matrix inversion can be computed with less than $O[(N-1)^3]$ operations.
It may also be noted that in the case of SMM the calculation of the IR of the inverse filter and data filtering are required
at every step of iteration, whereas the proposed method uses the estimated $\mathbf{d}$ directly to form the $\mathbf{B}$ matrix.
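
The structure of the matrix being inverted can be inspected directly. The sketch below assumes NumPy, with illustrative sizes and coefficients: it forms the product of B-transpose and B, checks that it is banded with a Toeplitz trailing block, and solves a linear system through a Cholesky factor. The generic `np.linalg.solve` calls on the triangular factors do not exploit the triangular or banded structure; a dedicated banded solver would be needed to realize the operation counts quoted above.

```python
import numpy as np

N, p = 12, 3
d = np.array([1.0, -0.5, 0.25, -0.1])        # illustrative AR(3) denominator

B = np.zeros((N, N - 1))
for k in range(1, N):
    for m in range(min(k, p) + 1):
        B[k - m, k - 1] = d[m]
M = B.T @ B                                   # (N-1) x (N-1), symmetric

rows, cols = np.indices(M.shape)
outside_band = M[np.abs(rows - cols) > p]     # entries more than p off the diagonal

L = np.linalg.cholesky(M)                     # M = L L^T, L lower triangular
y = np.ones(N - 1)
z = np.linalg.solve(L, y)                     # forward substitution step
x = np.linalg.solve(L.T, z)                   # back substitution step

T = M[p - 1:, p - 1:]                         # trailing (N-p) x (N-p) diagonal block
```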

The LS error criterion defined in (5) attempts to match only the first $N$ available samples of $h_p(n)$. No explicit
assumption has been made about the unobserved samples, but the estimated rational transfer function essentially
extends the impulse response beyond the observations. It may be noted here that the minimum phase property
cannot be guaranteed for the AR-parameter estimates produced by the proposed algorithm. Extensive simulation
studies indicate, though, that with converging IR sequences the algorithm always produced stable solutions. It
should also be pointed out that among existing methods, only the autocorrelation method can guarantee stable
solutions. But AM uses windowed data and the IR fit with the estimates is usually not very accurate because the
original least-squares IR error criterion is not minimized. To ensure a minimum phase solution, AM can be used
(instead of (32)) to obtain the initial estimates for starting the iterative AR-algorithm. If the estimate obtained
from the iterative scheme becomes non-minimum phase at any iteration step of the AR-algorithm, the iterations can
be terminated at that stage. The estimate found at the preceding iteration should then be accepted as the best available
minimum phase solution for the LS criterion in (5).
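
The safeguard described above amounts to a root test on each iterate. A minimal sketch, assuming NumPy and using hypothetical function names and made-up iterates, is:

```python
import numpy as np

def is_minimum_phase(d):
    """True if all roots of z^p + d_1 z^(p-1) + ... + d_p lie strictly inside the unit circle."""
    return bool(np.all(np.abs(np.roots(d)) < 1.0))

def last_minimum_phase(estimates):
    """Return the last estimate before the sequence first leaves the minimum-phase region."""
    accepted = None
    for d in estimates:
        if not is_minimum_phase(d):
            break
        accepted = d
    return accepted

# Hypothetical sequence of iterates; the third one has a root at z = 2.
iterates = [[1.0, -1.2, 0.81], [1.0, -1.0, 0.5], [1.0, -2.5, 1.0], [1.0, -0.5, 0.1]]
chosen = last_minimum_phase(iterates)
```

Here `chosen` is the second iterate, since the third has a root outside the unit circle and the iteration stops there.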

The  model  order  selection  problem  has  not  been  addressed  in  this  work.  It  appears  that  for  this  essentially 
deterministic  problem,  Akaike  Information  Criterion  (AIC)  or  Minimum  Description  Length  Criterion  (MDL) 
may  not  be  applicable.  But  these  criteria  may  be  utilized  when  the  prescribed  impulse  response  data  consists  of 
true  impulse  response  embedded  in  Gaussianly  distributed  noise. 

The algorithm presented in this Section may also be quite useful for estimating MA filter coefficients.
Presently, the most effective algorithm for MA modeling is Durbin’s method [11] which, in fact, relies on two
steps of AR parameter estimation. Traditionally, AM is utilized in both steps of Durbin’s method because it
produces minimum-phase polynomials [7, 9-11]. But the estimates obtained using AM may not be optimal because
the true impulse response fitting error norm is not minimized. The algorithm presented here, in contrast, produces
optimal least-squares AR filter coefficients from prescribed impulse response data. Hence, it can be expected that
the introduction of the proposed AR algorithm in one or both stages of Durbin’s algorithm may produce more
accurate MA parameter estimates.

It  has  been  shown  in  [7]  that  the  original  SMM  can  also  be  decoupled  into  a  linear  and  a  non-linear  subprob¬ 
lems.  In  a  recent  paper  [34],  the  strictly  proper  case  of  the  original  SMM  has  been  decoupled  somewhat  differently 
than  in  [7].  But  more  interestingly,  it  has  also  been  demonstrated  that  the  non-linear  part  of  the  decoupled  SM 
criterion  has  exact  mathematical  equivalence  with  the  optimal  EFM  criterion  in  [5].  It  appears  that  using  the 
new  definitions  of  the  matrices  resulting  from  the  matrix  partitioning  in  (19),  the  AR-version  of  SMM  as  given 
in  (8)  can  also  be  decoupled  into  linear  and  nonlinear  subproblems.  This  equivalence  may  have  an  important 
consequence  for  the  proposed  algorithm.  There  already  exists  a  convergence  analysis  of  the  original  SM  method 
in  [47].  It  can  be  hoped  that  the  convergence  analysis  will  also  apply  to  the  decoupled  form  of  SM  method  given 
in  [34].  If  that  happens  to  be  the  case,  as  alluded  to  in  [34],  the  convergence  analysis  in  [47]  should  also  apply  to 
the  iterative  computational  algorithm  presented  in  this  work.  It  should  be  noted  though  that  the  results  of  SMM 
and  the  proposed  optimal  method  may  not  be  identical.  Specifically,  the  numerator  in  the  decoupled  form  of  [34] 
is  computed  somewhat  differently  than  (26)  which  is  the  optimal  estimate.  Furthermore,  it  should  be  also  added 
that  the  iterative  scheme  in  (35a)  is  not  the  only  possible  approach  for  iterative  minimization  of  the  equivalent 
criterion in (22). In fact, removing the requirement of $d_0 = 1$, the eigenvector corresponding to the minimum
eigenvalue of the matrix may also be used as the estimate at the $(i+1)$-th iteration step [50]. This possibility
is not obvious from the original SMM algorithm in [3].

IV.  Simulation  Results 

In  this  Subsection,  the  performance  of  the  proposed  algorithm  is  evaluated  by  means  of  several  AR(p)  model 
identification  examples  with  different  p  values.  6  =  10“^  was  used  as  the  stopping  criterion  in  both  phases  of  the 
algorithm for all the examples below. The fitting-error norm defined in (5) was calculated at convergence using
the estimated parameters and the results are tabulated in the 'Minimized Error Norm' column. Furthermore, to
give a relative sense of performance, the ratio of the power of the 'true IR' (known in these simulations) to the
error power, expressed in dB, is also tabulated in the 'Closeness in dB' column.
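
The two tabulated figures of merit can be sketched as follows. This is only an illustration: it assumes that 'Closeness in dB' means 10 log10 of the ratio of true-IR power to error power, and that the error norm is the plain Euclidean norm; the function names `closeness_db` and `error_norm` are hypothetical and do not come from the report.

```python
import numpy as np

def closeness_db(h_true, h_est):
    """Ratio of true-IR power to error power in dB (10*log10 is an assumption)."""
    h_true = np.asarray(h_true, dtype=float)
    h_est = np.asarray(h_est, dtype=float)
    err = h_true - h_est
    return 10.0 * np.log10(np.sum(h_true ** 2) / np.sum(err ** 2))

def error_norm(h_desired, h_est):
    """Euclidean norm of the fitting error against the desired response."""
    diff = np.asarray(h_desired, dtype=float) - np.asarray(h_est, dtype=float)
    return float(np.linalg.norm(diff))
```

Under these assumptions, a larger 'Closeness in dB' value means that the fitted response lies closer to the true impulse response.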

Simulation  1  : 

The desired impulse response has a triangular form, as shown by the solid lines in Fig. 1A - 1D. The
resulting impulse response fits using the Covariance method and the Auto-correlation method are shown as connected
circles in Figures 1A and 1B, respectively. The impulse response matches at the end of each of the two phases
of the algorithm described in Subsection III with p = 4 are shown in Fig. 1C and Fig. 1D, respectively. The
minimized error norm and the closeness of the fit to the desired signal $h_d$ are listed in Table 1. The number of
iterations needed for convergence is also listed. It can be seen from the table and the figures that, compared to AM
and CM, the proposed scheme provided more accurate estimates. But it may also be observed that there is no
significant difference in the results between the 1st and the 2nd phase of the proposed algorithm.

Table 1: Example 1: Comparison of three methods with Triangular Impulse Response

Method                Closeness in dB    Minimized Error Norm    Number of Iterations
Covariance                  6.359              154.9                      -
Auto-Correlation            6.674              144.093                    -
Proposed, Phase-1          23.436                3.037                    5
Proposed, Phase-2          23.439                3.035                    3

Simulation  2  : 

An arbitrary impulse response was generated with p = 5 for these simulations. If the algorithm in Subsection
III is used directly to match the true response, it will give perfect results. Instead, Gaussian white
noise was added to the true response to obtain the desired response $h_d$. Hence, the estimates obtained with the
proposed algorithm will also be the MLE of the unknowns. For 20 dB noise, the true and the desired responses
are shown in Fig. 2A. The impulse response matches using the Covariance method and the Autocorrelation method are
shown in Figures 2B and 2C, respectively. The initial estimate obtained by minimizing the equation error in
(24a) is shown in Fig. 2D. The impulse response fits obtained using the proposed algorithm at the end of Phase-1
and Phase-2 are shown in Fig. 2E and Fig. 2F, respectively. The minimized error norms and the closeness to the
true response are listed in Table 2. It can be observed for this example that there is about a 3 dB difference in the
impulse response fit between the two phases, though the difference in the minimized error norms is quite small.

Table 2: Example 2: Comparison of three methods with 5-th order Impulse Response

Method                      Closeness in dB    Minimized Error Norm    Number of Iterations
Covariance                        1.027               1.847                     -
Auto-Correlation                 12.442               0.691                     -
Equation Error in (24a)          15.459               0.656                     -
Proposed, Phase-1                17.889               0.646                     5
Proposed, Phase-2                20.883               0.634                     4

From these simulation results, a fair conclusion may be drawn that Phase-1 of the algorithm does an
excellent job of error minimization. Hence, Phase-2 of the algorithm need not be invoked for most applications.
The results using SMM are close to the results at the end of Phase-1 if the original SMM [3] or the decoupled
form in [34] is used. There were some numerical differences in the coefficients, but the impulse response fits looked
almost alike. The results with the AR-version of SMM given in [7] were poorer than the Phase-1 results because the
numerator coefficient is set to 1 in [7]. Extensive simulations with other examples show equivalent performance.
Interestingly, the simulations also indicate that the proposed algorithm is quite insensitive to the choice of initial
estimates. In fact, when CM or AM was used in place of (32) for obtaining the initial estimates, the results
obtained at the end of Phase-1 or Phase-2 turned out to be identical to the results listed in the Tables. But
with the Covariance method as the initial estimate, Phase-1 sometimes took one or two extra iterations to converge.
This important observation indicates the robustness of the proposed algorithm to the choice of initial estimates.

V. Concluding Remarks

In  this  Section,  a  classical  rational  model  identification  problem  has  been  addressed.  The  major  focus  was 
to  develop  an  algorithm  for  optimal  estimation  of  the  parameters  of  an  all-pole  transfer  function  with  arbitrary 
number  of  poles  by  model-fitting  a  prescribed  impulse  response.  Unlike  some  existing  results,  no  linearization 
or  approximation  has  been  done  while  deriving  the  theoretical  optimization  criterion.  It  is  shown  that  the 
multidimensional  non-linear  problem  can  be  decoupled  into  two  smaller  problems  of  which  one  is  a  linear  problem 
and  the  other  one  is  a  non-linear  problem.  The  inherent  mathematical  structure  of  the  non-linear  part  is  utilized 
to  formulate  an  efficient  iterative  computational  algorithm  for  estimating  the  denominator  parameters.  Global 
optimality  properties  of  the  estimators  have  been  confirmed  by  relating  the  multidimensional  optimization  problem 
to  certain  well-known  results  in  numerical  analysis.  In  simulation  studies  also,  the  method  has  been  shown  to 
be  highly  effective.  Regarding  possible  future  work,  it  may  be  noted  that  most  of  the  existing  suboptimal  1-D 
algorithms  have  been  extended  for  estimating  2-D  filter  coefficients  from  2-D  spatial  domain  data  [22,  29,  35, 
36,  40-43].  The  possibility  of  formulating  an  optimal  2-D  AR-filter  design  technique  by  extending  the  proposed 
method is being studied [49]. Extension of this work to the identification of multidimensional AR systems from
multidimensional impulse response data [38] is also in progress.

Fig. 1: Simulation 1: A triangular impulse response is modeled by an AR(4) model. The stopping threshold δ of
Subsection IV was used in both phases. The solid lines denote the prescribed impulse response and the connected
circles show the fit with (A) Covariance Method, (B) Autocorrelation Method, (C) after Phase-1
convergence and (D) after Phase-2 convergence of the proposed method.

Fig. 2: Simulation 2: 20 dB noise was added to a true AR(5) impulse response to form the desired
response ($h_d$). The stopping threshold δ of Subsection IV was used as stopping criterion. The solid lines denote
the true impulse response. In Fig. 2A, the connected circles show the noisy signal $h_d$. In the other plots the
connected circles show the fit with (B) Covariance Method, (C) Autocorrelation Method, (D)
minimization of Equation error in (22f), (E) Phase-1 and (F) Phase-2 convergence.

Appendix: Computational Algorithm, Phase II

The second phase of the iterative algorithm is described in this Appendix. In this phase, the derivative of the
matrix W w.r.t. b is taken into consideration while minimizing the fitting error norm. The complete expression
of the squared $\ell_2$-norm of the error can be written as,

$$\|e\|_2^2 = e^T e = (Wg + WGb)^T (Wg + WGb). \tag{A.1}$$

By setting the derivative of this squared norm to zero, the updated b at the $(i+1)$-th iteration is given by,

$$b^{(i+1)} = -[U^{(i)} G]^{-1} [U^{(i)}] g \tag{A.2}$$

where (suppressing the superscript $(i)$),

$$U \triangleq L^T W + G^T W^T W. \tag{A.2a}$$

Here, the partial derivative of B with respect to each coefficient has the same form as the B matrix defined in
(19a), but filled with all zeros except at the locations where that coefficient appears. Each such derivative is
therefore an $N \times (N-1)$ real matrix whose only nonzero entries sit at those locations. (A.2d)


Once $U^{(i)}$ is found, $b^{(i+1)}$ can be formed as,

$$b^{(i+1)} = -[U^{(i)} G]^{-1} [U^{(i)}] g$$

(A.3a), (A.3b)


This minimization phase continues until $b^{(i+1)} \approx b^{(i)}$ is reached, and this optimum vector $b^o$ corresponds to a
minimum of the error surface of $\|e\|_2$.

Section - 3.4 : DESIGN OF DENOMINATOR-SEPARABLE 2-D IIR FILTERS


Summary 

Optimal design of an important class of two-dimensional (2-D) digital IIR filters from spatial impulse response
data is addressed. The denominator of the desired 2-D filter is assumed to be separable into two 1-D factors.
The filter coefficients are estimated by minimizing the $\ell_2$-norm of the error between the prescribed and the
estimated spatial domain responses. The denominator and numerator estimation problems are theoretically
decoupled into separate problems. The decoupled criteria have reduced dimensionality. The denominator criterion
is simultaneously optimized w.r.t. the coefficients in both dimensions using an iterative algorithm. The numerator
coefficients are found in a straightforward manner. If the desired response is known to be symmetric, the
proposed algorithm can be constrained to produce optimal denominators which are identical in both domains.
The performance of the algorithm is demonstrated with simulation examples.

I. Introduction

Two-dimensional IIR filters are commonly used in image processing and 2-D filtering. Synthesis of such
filters from prescribed spatial domain impulse response data is an important and challenging design problem and
has received considerable attention in recent literature [1, 2, 4, 5, 8, 12, 14, 16]. Spatial-domain design of 2-D
IIR filters is analogous to 1-D recursive filter design based on time-domain specifications. Most 2-D filter design
algorithms are basically extensions of existing 1-D algorithms. In particular, Shanks et al. [12] extended the
work of Shanks [11]; Cadzow [1] and Shaw and Mersereau [16] utilized many of the general non-linear optimization
methods; and Shaw and Mersereau [16] also extended the work of Steiglitz and McBride [17]. The 1-D work of
Mullis and Roberts [7] was further extended and applied to the 2-D case in [5].

The approaches noted above do not minimize the true spatial impulse response error, though it may be
mentioned that the extension of the Steiglitz-McBride method in [16] closely approximates the true fitting error. For
the strictly-proper case, i.e., when the numerator order is one less than that of the denominator, Evans and Fischl
(EFM) proposed an optimal method for synthesis of 1-D IIR filters [3]. The 2-D filter synthesis algorithm
presented here is a generalization of EFM to 2-D. The proposed solution is optimal in the sense that it minimizes the true
and complete spatial error criterion for the design of strictly-proper 2-D IIR filters.

EFM has been found to be highly accurate for 1-D filter design. A modified complex version of the EFM
with certain symmetry constraints has also been shown to be effective for maximum-likelihood 1-D and 2-D
frequency-wavenumber estimation [8, 13, 15]. Generalization of EFM for strictly-proper 2-D filter design has also
been considered previously [4, 5], but it appears that the full potential of EFM has not been utilized in the 2-D
case. Specifically, it will be shown that the complete error criterion encompassing the entire subspace orthogonal
to the model fitting error was not optimized in [4, 5]. Instead, two suboptimal error criteria were formed, one in each
domain, and the filter coefficients were optimized in the two dimensions independently.

In  this  Section,  a  2-D  version  of  EFM  is  developed  for  optimal  design  of  2-D  recursive  filters  from  prescribed 
spatial  domain  data.  The  complete  basis  space  orthogonal  to  the  spatial  fitting  error  will  be  identified  and  the 
corresponding  error  criterion  will  be  shown  to  be  dependent  only  on  the  2-D  filter  parameters.  Similar  to  1-D 
EFM,  the  non-linear  error  criterion  will  be  decoupled  into  a  purely  linear  and  a  non-linear  sub-problem.  For 
the  separable  denominator  case,  it  is  also  shown  that  the  error  vector  possesses  a  quasi-linear  relationship  with 
the denominator coefficients in both domains simultaneously. Unlike several existing 2-D methods [1, 3, 4, 5],
the exact fitting error is minimized w.r.t. the filter coefficients in both dimensions simultaneously. Simultaneous
optimization is particularly effective for synthesizing 2-D filters with symmetric impulse responses, which are quite
common in practice. In such cases, the criterion can be constrained to produce identical denominators in both
domains, ensuring symmetry in the estimated spatial response.

The Section is arranged as follows: In Subsection II, the least-squares problem is stated. In Subsection III,
the preliminaries for the non-separable numerator and separable denominator case are given. In Subsection IV,
the new orthogonal basis spaces are defined, the error criterion is derived and the computational algorithm is
summarized. In Subsection V, some simulation results are given and finally, in Subsection VI, some concluding
remarks along with directions for future work are included.




II. Problem Statement and Formulation

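In general, a 2-D rational function $H(z_1, z_2)$ with non-decomposable numerator and denominator is described as

$$H(z_1, z_2) = \frac{Q(z_1, z_2)}{P(z_1, z_2)} = \frac{\sum_{i=0}^{n_1} \sum_{j=0}^{n_2} q(i,j)\, z_1^{-i} z_2^{-j}}{\sum_{i=0}^{m_1} \sum_{j=0}^{m_2} p(i,j)\, z_1^{-i} z_2^{-j}}. \tag{1}$$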

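Note that for the strictly-proper case of EFM, $n_1 = m_1 - 1$ and $n_2 = m_2 - 1$. If the $k_1 \times k_2$ first quadrant samples
are assumed to be significant, $H(z_1, z_2)$ can also be written as,

$$H(z_1, z_2) = \mathbf{z}_1^T H \mathbf{z}_2, \tag{2}$$

where $\mathbf{z}_1 \triangleq [1 \;\; z_1^{-1} \;\; \cdots \;\; z_1^{-(k_1-1)}]^T$, $\mathbf{z}_2 \triangleq [1 \;\; z_2^{-1} \;\; \cdots \;\; z_2^{-(k_2-1)}]^T$ and

$$H \triangleq \begin{bmatrix} h(0,0) & h(0,1) & \cdots & h(0,k_2-1) \\ h(1,0) & h(1,1) & \cdots & h(1,k_2-1) \\ \vdots & \vdots & & \vdots \\ h(k_1-1,0) & h(k_1-1,1) & \cdots & h(k_1-1,k_2-1) \end{bmatrix}. \tag{3}$$

Define a vector by stacking the columns of H as follows:

$$h \triangleq \begin{bmatrix} h_1 \\ h_2 \\ \vdots \\ h_{k_2} \end{bmatrix} \tag{4}$$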

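where $h_i$ denotes the $i$-th column of H. Next, let the prescribed space-domain impulse response matrix be denoted
as,

$$X \triangleq \begin{bmatrix} x(0,0) & x(0,1) & \cdots & x(0,k_2-1) \\ x(1,0) & x(1,1) & \cdots & x(1,k_2-1) \\ \vdots & \vdots & & \vdots \\ x(k_1-1,0) & x(k_1-1,1) & \cdots & x(k_1-1,k_2-1) \end{bmatrix} \tag{5a}$$

$$\triangleq [x_1 \;\; x_2 \;\; \cdots \;\; x_{k_2}] \tag{5b}$$

and the corresponding vector be formed as,

$$x \triangleq \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_{k_2} \end{bmatrix}. \tag{6}$$

In this Section, the following 2-D least-squares synthesis problem is addressed: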

Given the 2-D spatial impulse response matrix X, estimate p and q by optimizing the following error criterion,

$$\min_{q,\,p} \|e\|^2 \triangleq \|x - h\|^2 \quad \text{with } p(0,0) = 1, \tag{7a}$$

where,

$$q \triangleq [q(0,0) \;\; q(0,1) \;\; \cdots \;\; q(n_1,n_2)]^T \quad \text{and} \tag{7b}$$

$$p \triangleq [p(0,0) \;\; p(0,1) \;\; \cdots \;\; p(m_1,m_2)]^T. \tag{7c}$$


This  problem  is  nonlinear  in  p  and  standard  gradient-based  optimization  algorithms  have  been  used  for  1-D  as 
well  as  for  2-D  designs  [1,  2,  16].  But  these  generic  algorithms  do  not  make  effective  use  of  the  matrix-structures 
inherent in this particular problem and they are known to be sensitive to initialization. Several sub-optimal
algorithms  based  on  linearization  of  the  true  non-linear  criterion  have  also  been  proposed  [2,  11,  16,  17].  In  this 
work,  the  exact  fitting  criterion  defined  in  (7)  will  be  theoretically  decoupled  into  a  purely  linear  problem  for  q 
and  a  non-linear  problem  for  p.  Furthermore,  the  non-linear  criterion  will  be  shown  to  possess  a  quasi-linear 
relationship  to  the  unknown  denominator  coefficients.  This  will  lead  to  the  formulation  of  an  iterative  algorithm 
for  its  minimization. 
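
The column stacking in (4) and (6) and the fitting criterion in (7a) can be illustrated with a short NumPy sketch. The helper names `stack_columns` and `fitting_error_sq` and the small numerical arrays are hypothetical, chosen only for illustration; the model response H here is an arbitrary array, not the output of any design method.

```python
import numpy as np

def stack_columns(M):
    """Stack the columns of a k1 x k2 matrix into a length k1*k2 vector."""
    return np.asarray(M, dtype=float).flatten(order="F")

def fitting_error_sq(X, H):
    """Squared l2-norm ||x - h||^2 of the stacked spatial fitting error."""
    e = stack_columns(X) - stack_columns(H)
    return float(e @ e)

# Hypothetical 3 x 2 prescribed response X and model response H (k1 = 3, k2 = 2).
X = np.array([[1.0, 0.5], [0.5, 0.25], [0.25, 0.125]])
H = np.array([[1.0, 0.4], [0.5, 0.2], [0.2, 0.1]])
x = stack_columns(X)  # first column of X, followed by the second column
```

Because stacking only rearranges entries, the criterion equals the squared Frobenius norm of X - H.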

III.  Design  With  Separable  Denominator  and  Non-Separable  Numerator 
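In this case, the 2-D rational transfer function can be written as,

$$H(z_1, z_2) = \frac{\sum_{i=0}^{n_1} \sum_{j=0}^{n_2} q(i,j)\, z_1^{-i} z_2^{-j}}{\sum_{i=0}^{m_1} c(i)\, z_1^{-i} \; \sum_{j=0}^{m_2} d(j)\, z_2^{-j}}. \tag{8a}$$

Define,

$$c \triangleq [c(0) \;\; c(1) \;\; \cdots \;\; c(m_1)]^T \quad \text{and} \tag{8b}$$

$$d \triangleq [d(0) \;\; d(1) \;\; \cdots \;\; d(m_2)]^T. \tag{8c}$$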


Multiplying both sides of (8a) by $\sum_{i=0}^{m_1} c(i)\, z_1^{-i} \sum_{j=0}^{m_2} d(j)\, z_2^{-j}$ and equating the coefficients of the same powers of
$z_1$ and $z_2$ gives [5],

$$[D_1^T \otimes C_1^T]\, h = q \tag{9a}$$

$$[D_1^T \otimes C^T]\, h = 0 \tag{9b}$$

$$[D^T \otimes C_1^T]\, h = 0 \tag{9c}$$

$$[D^T \otimes C^T]\, h = 0 \tag{9d}$$

$$[I_{k_2} \otimes C^T]\, h = 0 \tag{9e}$$

$$[D^T \otimes I_{k_1}]\, h = 0 \tag{9f}$$

where $I_{k_1} \in \mathbb{R}^{k_1 \times k_1}$ and $I_{k_2} \in \mathbb{R}^{k_2 \times k_2}$ are identity matrices and $\otimes$ denotes the Kronecker product [9]. If h, c
and d are known, the numerator vector q can be calculated using (9a). But in practice, h, c and d need to be
estimated from the prescribed response x. If h is replaced by the prescribed x in (9b)-(9f), the right hand sides
will not be equal to zero. Instead, it will result in the following equation errors:


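$$[D_1^T \otimes C^T]\, x = [D_1^T \otimes C^T][h + e] = [D_1^T \otimes C^T]\, e \triangleq e_{eq}^{(1)} \tag{11a}$$

$$[D^T \otimes C_1^T]\, x = [D^T \otimes C_1^T][h + e] = [D^T \otimes C_1^T]\, e \triangleq e_{eq}^{(2)} \tag{11b}$$

$$[D^T \otimes C^T]\, x = [D^T \otimes C^T][h + e] = [D^T \otimes C^T]\, e \triangleq e_{eq}^{(3)} \tag{11c}$$

$$[I_{k_2} \otimes C^T]\, x = [I_{k_2} \otimes C^T][h + e] = [I_{k_2} \otimes C^T]\, e \triangleq e_{eq}^{(4)} \tag{11d}$$

$$[D^T \otimes I_{k_1}]\, x = [D^T \otimes I_{k_1}][h + e] = [D^T \otimes I_{k_1}]\, e \triangleq e_{eq}^{(5)}. \tag{11e}$$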

In (11a)-(11e), the fact that x = h + e and the orthogonal relationships in (9b)-(9f) have been utilized. These
equations show the relationships between the fitting error e and the equation errors. As in the case of 1-D EFM [7], in
order to minimize $\|e\|^2$, an inverse relationship of the form,

$$e = W(c, d)\, e_{eq} \tag{12}$$

needs to be found. The matrix W(c,d) needs to be constructed using c and d.

The problem of determining the denominator coefficient vectors c and d is essentially equivalent to the
search for $(k_1 k_2 - m_1 m_2)$ linearly independent vectors orthogonal to h. This orthogonal basis space must be
dependent on the elements in c and d only. Equation (9a) clearly shows that $D_1 \otimes C_1 \in \mathbb{R}^{k_1 k_2 \times m_1 m_2}$ can not be
orthogonal to h. On the other hand, (9b)-(9f) demonstrate that the matrices $D_1 \otimes C$, $D \otimes C_1$, $D \otimes C$, $I_{k_2} \otimes C$ and
$D \otimes I_{k_1}$ are indeed all orthogonal to h. But summing the respective number of columns of these five full-rank
matrices, the total turns out to be $(3 k_1 k_2 - m_1 m_2 - k_1 m_2 - k_2 m_1)$ vectors of length $k_1 k_2$ each. Hence, this set
of matrices can not all be linearly independent of each other. In [5], the matrices $D_1 \otimes C$, $D \otimes C_1$, $D \otimes C$, as
formed in (9b)-(9d), respectively, were utilized to form an inverse relationship as required by (12). This set of
matrices does contain $(k_1 k_2 - m_1 m_2)$ linearly independent vectors of length $k_1 k_2$, but unfortunately, they are not
orthogonal to each other.
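
The column counts quoted above can be checked with simple arithmetic. The sketch below assumes the matrix shapes implied by the text, namely that C is $k_1 \times (k_1 - m_1)$, $C_1$ is $k_1 \times m_1$, D is $k_2 \times (k_2 - m_2)$ and $D_1$ is $k_2 \times m_2$; the function name `column_counts` and the example sizes are hypothetical.

```python
def column_counts(k1, k2, m1, m2):
    """Number of columns of each Kronecker product under the assumed shapes."""
    # Assumed shapes: C: k1 x (k1-m1), C1: k1 x m1, D: k2 x (k2-m2), D1: k2 x m2.
    return {
        "D1(x)C": m2 * (k1 - m1),
        "D(x)C1": (k2 - m2) * m1,
        "D(x)C": (k2 - m2) * (k1 - m1),
        "I_k2(x)C": k2 * (k1 - m1),
        "D(x)I_k1": (k2 - m2) * k1,
    }

k1, k2, m1, m2 = 8, 6, 3, 2  # hypothetical sizes
counts = column_counts(k1, k2, m1, m2)
total_five = sum(counts.values())
first_three = counts["D1(x)C"] + counts["D(x)C1"] + counts["D(x)C"]
```

With these sizes the five matrices supply 104 columns in a space of dimension 48, so they cannot all be independent, while the first three supply exactly $k_1 k_2 - m_1 m_2 = 42$.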

It may be noted here that, in a previous 2-D generalization of EFM [4], the complete spatial fitting error
criterion was not formulated. In a later work, it was partially formulated in equation (14) of [5], but no algorithm
was presented for minimizing that criterion with respect to the unknown parameters. Instead, in both those
works, two separate criteria were minimized independently. Specifically, (9e) and (9f) were used in [4, 5] to
estimate c and d using two independent optimizations. But $I_{k_2} \otimes C$ and $D \otimes I_{k_1}$ contain $(k_1 k_2 - m_1 k_2)$ and
$(k_1 k_2 - m_2 k_1)$ linearly independent vectors, respectively. The entire $(k_1 k_2 - m_1 m_2)$ dimensional vector-space
orthogonal to h was not optimized simultaneously w.r.t. c and d. It is not apparent if the optima of these separate
criteria are identical to those of the true 2-D criterion. In Subsection IV, a new set of orthogonal vectors will be
constructed which will lead to the formulation of an exact 2-D spatial fitting error criterion that can be optimized
simultaneously w.r.t. c and d.

IV.  Formulation  of  the  Orthogonal  Basis  Space  : 

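According to the orthogonality principle [15], the fitting error e, at minimum, ought to be orthogonal to the
'estimated' h that minimizes the error. It is also desirable to have the resulting error criterion dependent on the
denominator coefficients only. To meet these requirements, two Vandermonde matrices are formed as follows,

$$T \triangleq \begin{bmatrix} 1 & \cdots & 1 \\ t_1 & \cdots & t_{m_1} \\ \vdots & & \vdots \\ t_1^{k_1-1} & \cdots & t_{m_1}^{k_1-1} \end{bmatrix} \quad \text{and} \quad Q \triangleq \begin{bmatrix} 1 & \cdots & 1 \\ q_1 & \cdots & q_{m_2} \\ \vdots & & \vdots \\ q_1^{k_2-1} & \cdots & q_{m_2}^{k_2-1} \end{bmatrix} \tag{13}$$

where $t_i$, $i = 1, \ldots, m_1$ and $q_j$, $j = 1, \ldots, m_2$ are the roots of the polynomials $C(z_1) = \sum_{i=0}^{m_1} c(i)\, z_1^{-i}$
and $D(z_2) = \sum_{j=0}^{m_2} d(j)\, z_2^{-j}$, respectively. Hence, by construction,

$$C^T T = 0 \quad \text{and} \tag{14a}$$

$$D^T Q = 0. \tag{14b}$$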

Furthermore, using (9e) and (9f),

$$[Q^T \otimes C^T]\, h = [Q^T \otimes I][I \otimes C^T]\, h = 0 \quad \text{and} \tag{15a}$$

$$[D^T \otimes T^T]\, h = [I \otimes T^T][D^T \otimes I]\, h = 0. \tag{15b}$$


The orthogonality relationships in (9d), (15a) and (15b) demonstrate that the three matrices $Q \otimes C$, $D \otimes T$
and $D \otimes C$ together constitute a $(k_1 k_2 - m_1 m_2)$ dimensional vector space orthogonal to h. Interestingly, these
matrices are not only formed with linearly independent columns, they are also mutually orthogonal.
This orthogonality claim can be easily substantiated as follows:

$$[Q^T \otimes C^T][D \otimes T] = [Q^T D \otimes C^T T] = [0 \otimes 0] = 0, \tag{16a}$$

$$[Q^T \otimes C^T][D \otimes C] = [Q^T D \otimes C^T C] = [0 \otimes C^T C] = 0 \quad \text{and} \tag{16b}$$

$$[D^T \otimes T^T][D \otimes C] = [D^T D \otimes T^T C] = [D^T D \otimes 0] = 0. \tag{16c}$$


It  should  also  be  mentioned  here  that  the  matrices  T  and  Q  are  useful  only  in  this  intermediate  stage  of  deriving 
the  2-D  error  criterion  and  they  will  not  be  needed  in  the  final  optimization  steps. 
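
The relationships in (14)-(16) can be checked numerically. The sketch below rests on assumptions that are not taken from the report: C and D are built as banded matrices whose columns hold the reversed denominator coefficients (the report's own definition is not reproduced here), the roots are hypothetical real and distinct values, and the test response is generated as $H = T A Q^T$, which is the form a strictly-proper filter with distinct poles takes after partial-fraction expansion.

```python
import numpy as np

def banded(coeffs, k):
    """k x (k - m) matrix; column j holds the reversed coefficients in rows j..j+m."""
    coeffs = np.asarray(coeffs, dtype=float)
    m = len(coeffs) - 1
    M = np.zeros((k, k - m))
    for j in range(k - m):
        M[j:j + m + 1, j] = coeffs[::-1]
    return M

def vandermonde(roots, k):
    """k x m matrix whose i-th column is [1, r_i, r_i**2, ..., r_i**(k-1)]."""
    return np.vander(np.asarray(roots, dtype=float), N=k, increasing=True).T

k1, k2 = 8, 6
t = [0.5, -0.3, 0.8]             # hypothetical roots of C(z1), so m1 = 3
q = [0.6, -0.4]                  # hypothetical roots of D(z2), so m2 = 2
c, d = np.poly(t), np.poly(q)    # coefficient vectors with c(0) = d(0) = 1
C, D = banded(c, k1), banded(d, k2)
T, Q = vandermonde(t, k1), vandermonde(q, k2)

B1, B2, B3 = np.kron(Q, C), np.kron(D, T), np.kron(D, C)
basis = np.hstack([B1, B2, B3])

# Test response with the assumed pole structure, stacked column by column.
rng = np.random.default_rng(0)
A = rng.standard_normal((len(t), len(q)))
h = (T @ A @ Q.T).flatten(order="F")

checks = {
    "14a": np.allclose(C.T @ T, 0.0),
    "14b": np.allclose(D.T @ Q, 0.0),
    "16a": np.allclose(B1.T @ B2, 0.0),
    "16b": np.allclose(B1.T @ B3, 0.0),
    "16c": np.allclose(B2.T @ B3, 0.0),
    "orthogonal_to_h": np.allclose(basis.T @ h, 0.0),
    "rank": np.linalg.matrix_rank(basis) == k1 * k2 - len(t) * len(q),
}
```

The mixed-product rule of the Kronecker product, $(A \otimes B)(E \otimes F) = AE \otimes BF$, is what reduces each check in (16) to the small products in (14).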

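If the vector h in (15) is replaced by x, then similar to (11), the following equation errors are formed:

$$\begin{pmatrix} Q^T \otimes C^T \\ D^T \otimes T^T \\ D^T \otimes C^T \end{pmatrix} x = \begin{pmatrix} Q^T \otimes C^T \\ D^T \otimes T^T \\ D^T \otimes C^T \end{pmatrix} e \triangleq e_{eq}. \tag{17}$$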


The optimized e must be orthogonal to the estimated h, whereas the three matrices, $Q \otimes C$, $D \otimes T$ and $D \otimes C$,
were shown to be orthogonal to h in (15a), (15b) and (9d), respectively. Hence, e can be constructed as a linear
combination of the columns of this orthogonal set of matrices, i.e.,

$$e = (Q \otimes C \;\;\; D \otimes T \;\;\; D \otimes C)\, f \tag{18}$$

where,

$$f \triangleq [f_1 \;\; f_2 \;\; \cdots \;\; f_{k_1 k_2 - m_1 m_2}]^T \tag{19}$$

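is a vector of constants which are to be determined. Using this form of e in (17), the equation error can be written
as,

$$\begin{pmatrix} Q^T \otimes C^T \\ D^T \otimes T^T \\ D^T \otimes C^T \end{pmatrix} x = \begin{pmatrix} Q^T Q \otimes C^T C & 0 & 0 \\ 0 & D^T D \otimes T^T T & 0 \\ 0 & 0 & D^T D \otimes C^T C \end{pmatrix} f \triangleq e_{eq}. \tag{20}$$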


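The matrix on the r.h.s. is square block-diagonal with square diagonal blocks and hence it can be inverted to
uniquely determine the vector of constants f as,

$$f = \begin{pmatrix} (Q^T Q)^{-1} \otimes (C^T C)^{-1} & 0 & 0 \\ 0 & (D^T D)^{-1} \otimes (T^T T)^{-1} & 0 \\ 0 & 0 & (D^T D)^{-1} \otimes (C^T C)^{-1} \end{pmatrix} \begin{pmatrix} Q^T \otimes C^T \\ D^T \otimes T^T \\ D^T \otimes C^T \end{pmatrix} x \tag{21a}$$

$$= \begin{pmatrix} (Q^T Q)^{-1} Q^T \otimes (C^T C)^{-1} C^T \\ (D^T D)^{-1} D^T \otimes (T^T T)^{-1} T^T \\ (D^T D)^{-1} D^T \otimes (C^T C)^{-1} C^T \end{pmatrix} x. \tag{21b}$$

Using this expression of f in (18), the fitting error becomes,

$$e = [P_Q \otimes P_C + P_D \otimes P_T + P_D \otimes P_C]\, x \tag{22}$$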

where $P_{[\cdot]}$ denotes the Projection matrix, e.g., $P_C \triangleq C (C^T C)^{-1} C^T$. Unfortunately, this error vector is
dependent on T and Q, which must be removed. According to (14), the matrices C and D are orthogonal to T
and Q, respectively. Furthermore,

$$\mathrm{rank}(T) + \mathrm{rank}(C) = k_1 \quad \text{and} \tag{23a}$$

$$\mathrm{rank}(Q) + \mathrm{rank}(D) = k_2. \tag{23b}$$

Hence, using a theorem on Projection matrices [14],

$$P_C + P_T = I_{k_1} \quad \text{and} \tag{24a}$$

$$P_D + P_Q = I_{k_2}. \tag{24b}$$

Using these relationships in (22), $P_T$ and $P_Q$ can now be replaced and the error vector can be written as,

$$e = [(I_{k_2} - P_D) \otimes P_C + P_D \otimes (I_{k_1} - P_C) + P_D \otimes P_C]\, x \tag{25a}$$

$$= [I_{k_2} \otimes P_C - P_D \otimes P_C + P_D \otimes I_{k_1} - P_D \otimes P_C + P_D \otimes P_C]\, x \tag{25b}$$

$$= [I_{k_2} \otimes P_C + P_D \otimes I_{k_1} - P_D \otimes P_C]\, x. \tag{25c}$$

Note that in this final form of the error, there is no dependence on either T or Q. Hence, the error criterion for
determining the denominator coefficient vectors c and d can now be written as,

$$\min_{c,\,d} \|e(c,d)\|^2 = \min_{c,\,d} \left( x^T [I_{k_2} \otimes P_C + P_D \otimes I_{k_1} - P_D \otimes P_C]\, x \right). \tag{26}$$


Equations (26) and (9a) represent the desired decoupled criteria for determining the denominator and numerator
coefficients, respectively. Optimization of (26) would produce the optimal c and d, denoted as $c^o$ and $d^o$,
respectively. Letting $e^o$ denote the minimized error corresponding to the optimum denominator coefficients, the
optimum spatial-response vector h can be found from,

$$h^o \triangleq x - e^o. \tag{27}$$

This $h^o$ can then be used in (9a) to obtain the optimal numerator vector, $q^o$.
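
The step from (22) to (25c) and the quadratic form in (26) can be verified numerically for fixed denominators. The sketch below only evaluates the criterion at given c and d; it does not perform the minimization. The banded construction of C and D, the hypothetical roots, and the noisy test data are assumptions made for illustration.

```python
import numpy as np

def banded(coeffs, k):
    """k x (k - m) matrix; column j holds the reversed coefficients in rows j..j+m."""
    coeffs = np.asarray(coeffs, dtype=float)
    m = len(coeffs) - 1
    M = np.zeros((k, k - m))
    for j in range(k - m):
        M[j:j + m + 1, j] = coeffs[::-1]
    return M

def proj(M):
    """Orthogonal projector onto the column space of a full-column-rank M."""
    return M @ np.linalg.solve(M.T @ M, M.T)

k1, k2 = 8, 6
t, q = np.array([0.5, -0.3, 0.8]), np.array([0.6, -0.4])  # hypothetical roots
C, D = banded(np.poly(t), k1), banded(np.poly(q), k2)
T = np.vander(t, N=k1, increasing=True).T
Q = np.vander(q, N=k2, increasing=True).T
Pc, Pd, Pt, Pq = proj(C), proj(D), proj(T), proj(Q)
I1, I2 = np.eye(k1), np.eye(k2)

M22 = np.kron(Pq, Pc) + np.kron(Pd, Pt) + np.kron(Pd, Pc)   # form in (22)
M25 = np.kron(I2, Pc) + np.kron(Pd, I1) - np.kron(Pd, Pc)   # form in (25c)

# Hypothetical data: a response with the assumed poles plus small noise.
rng = np.random.default_rng(0)
h_true = (T @ rng.standard_normal((3, 2)) @ Q.T).flatten(order="F")
x = h_true + 0.01 * rng.standard_normal(k1 * k2)

e = M25 @ x              # fitting error for these fixed c and d
criterion = x @ M25 @ x  # quadratic form appearing in (26)
h_hat = x - e            # response estimate, as in (27)
```

Since the matrix in (25c) equals $I - P_Q \otimes P_T$, it is itself an orthogonal projector, which is why the quadratic form in (26) equals $\|e\|^2$.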

Analyzing the criterion in (26), it is apparent that the first two terms are the orthogonal projections of the data
x onto the parameter spaces of each of the two spatial dimensions. The third term is the orthogonal projection
common to both dimensions, but it is subtracted once because the common (or joint) projections have already been
included once in each of the first two terms. It is very interesting to note that this criterion is quite analogous
to the standard formula for the Probability of the Union of two sets. It may be emphasized here that this form of
the error criterion is not only mathematically appropriate, it is intuitively appealing as well, and this form of the
2-D error criterion was not arrived at in any of the previous generalizations of EFM [4, 5]. With further algebraic
manipulations, the error vector e can also be shown to be related to both the denominator vectors c and d in a
quasi-linear manner as,

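$$e(c, d) \triangleq \Big( \big((I_{k_2} - P_D) \otimes W(c)\big)\, \mathbf{X}_1 \;\;\; \big(W(d) \otimes I_{k_1}\big)\, \mathbf{X}_2 \Big) \begin{pmatrix} c \\ d \end{pmatrix}, \tag{28a}$$

where,

$$W(c) \triangleq C (C^T C)^{-1} \quad \text{and} \tag{28b}$$

$$W(d) \triangleq D (D^T D)^{-1}. \tag{28c}$$

$\mathbf{X}_1$ and $\mathbf{X}_2$ are constructed from the elements in X as [9, 17],

$$\mathbf{X}_1 \triangleq (X_1^T \;\; \cdots \;\; X_{k_2}^T)^T, \tag{29}$$

where the $(l, k)$-th term of $X_i$ is formed as,

$$X_i(l, k) \triangleq x_i(l - k + m_1 + 1), \quad \text{for } i = 1, 2, \ldots, k_2, \text{ and} \tag{30a}$$

$$\mathbf{X}_2 \triangleq \begin{bmatrix} x_{m_2+1} & \cdots & x_1 \\ x_{m_2+2} & \cdots & x_2 \\ \vdots & & \vdots \\ x_{k_2} & \cdots & x_{k_2 - m_2} \end{bmatrix}. \tag{30b}$$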

Equation (28a) is one of the key results derived in this Section. It clearly shows that both the unknown coefficient
vectors c and d appear simultaneously in a 'quasi-linear' relationship w.r.t. the true error vector. This quasi-linear
relationship can be exploited for simultaneous optimization of the criterion in (26) w.r.t. c and d. The algorithm
is similar in flavor to the ones in [3-5, 8, 13], except that both the denominators are optimized and estimated
simultaneously. Specifically, the algorithm iteratively minimizes the $\ell_2$-norm of the error vector formed in (26)
in two phases. In Phase-1, the matrices in (28a) that depend on c and d are treated as constants and are formed
using the estimates of c or d obtained at the previous iteration. In Phase-2, the estimates are improved upon by
setting the gradient of the complete error-norm to zero. The iterations are initialized by setting
$c = [1 \; 0 \; \ldots \; 0]^T$ and $d = [1 \; 0 \; \ldots \; 0]^T$.
The iterations are continued until the changes in the estimates in successive iterations become very small. It may
be noted here that extensive simulation experience in 1-D [7, 8, 13, 14] as well as in 2-D cases [13, 15] indicates
that Phase-1 itself produces very good estimates of the filter coefficients and in most cases, there may not be any
need for invoking Phase-2 at all. It may also be noted that in [4, 5], the complete error e(c, d) in (26) was not
optimized.

Symmetric Spatial Response - A Special Case : Many 2-D filters used in image processing are symmetrically
shaped in the spatial domain. Notable examples are Gaussian and circular filters. In designing such
spatially symmetric 2-D filters, the methods in [4, 5] sometimes produced slightly different sets of denominator
polynomials. Hence, the estimated spatial response may not possess the desired symmetry. This problem may
be attributed to separate estimation of the individual denominators. In the proposed approach, both the
denominators are optimized simultaneously by minimizing the entire error in (28a). If necessary, the desired symmetry
may be imposed by setting c = d in (28a) at the outset. For this very important special case, (28a)
would have the following form :

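$$e(c) \triangleq \Big[ \big( (I_{k_2} - P_C) \otimes W_{k_1}(c) \big) \mathcal{X}_1 + \big( W_{k_2}(c) \otimes I_{k_1} \big) \mathcal{X}_2 \Big] c, \tag{31}$$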

where,  the  subscripts  of  W  denote  leading  dimensions  which  may  be  unequal.  Minimization  of  the  norm  of  the 
error  in  (31)  will  result  in  a  single  set  of  optimal  coefficients  meant  for  both  dimensions.  This  is  one  of  the  major 
advantages  of  the  proposed  approach  over  the  ones  in  [4,  5]  where  separate  optimization  in  each  domain  does  not 
necessarily  guarantee  identical  denominator  coefficients  in  both  domains. 

V.  Simulation  Results 

In order to demonstrate the effectiveness of the proposed algorithm, the results of the design of a Gaussian filter
are given here. The spatial response of a quarter-plane Gaussian filter defined over the first quadrant is given by

$$x(i, j) = 0.256322 \, \exp\!\left[ -0.103203 \left\{ (i-4)^2 + (j-4)^2 \right\} \right],$$

where $(i, j) \in S_I$ and the support $S_I$ is given by $S_I = \{ (i, j) \mid 0 \le i \le 14;\; 0 \le j \le 14 \}$. The true or the
desired spatial response is shown in Fig. 1. Note that the spatial response is symmetric around its center point.
Fig. 2 through 4 show the estimated responses for filter orders (m1 = m2) 4, 5 and 6, respectively. The results
were obtained by minimizing the norm of the error vector in (31) with different orders. The algorithm converged
in 5-7 iterations. The plots clearly show that the estimated spatial impulse responses match the desired one
quite closely and, as can be expected, the match improves as the filter order increases. With a sixth-order filter, the
difference between the true and the estimated response is almost negligible. The closeness between the true and
the estimated response was also measured in terms of the ratio of the power of the true response to that of the
error in each case. The ratios were found to be about 41.2 dB, 61.4 dB and 86.5 dB for filter orders 4, 5 and 6,
respectively. Simulations with other forms of 2-D filters also showed similar performance.
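The test signal and the closeness measure used above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the "estimate" is a stand-in (the true response scaled by 1 %) used purely to exercise the power-ratio formula, not an output of the proposed algorithm.

```python
import numpy as np

def gaussian_response(size=15, center=4, gain=0.256322, rate=0.103203):
    """Quarter-plane Gaussian spatial response x(i, j) on a size-by-size support."""
    i, j = np.meshgrid(np.arange(size), np.arange(size), indexing="ij")
    return gain * np.exp(-rate * ((i - center) ** 2 + (j - center) ** 2))

def signal_to_error_db(true_resp, est_resp):
    """Ratio of the power of the true response to the power of the error, in dB."""
    err = true_resp - est_resp
    return 10.0 * np.log10(np.sum(true_resp ** 2) / np.sum(err ** 2))

x = gaussian_response()
# Stand-in estimate (assumption): a 1 % amplitude error, not a designed filter.
x_hat = 1.01 * x
ratio_db = signal_to_error_db(x, x_hat)
```

A uniform 1 % amplitude error corresponds to a ratio of 40 dB, which gives a sense of scale for the 41.2 dB to 86.5 dB figures reported above.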
VI. Conclusion and Future Work :

An optimal 2-D IIR filter design method has been presented. The algorithm is a 2-D extension of an
existing optimal 1-D approach. The 2-D model-fitting criterion has been decoupled into linear and nonlinear
sub-problems. The nonlinear part has been shown to possess a quasi-linear relationship with the unknown
denominator coefficients. The algorithm simultaneously optimizes the coefficients in both dimensions. Regarding
future work, it may be noted that, similar to EFM [3], the proposed algorithm is applicable to strictly-proper
designs only, albeit in 2-D. Recently, an optimal 1-D algorithm (OM), which is applicable to any general system
with an arbitrary number of poles and zeros, has been presented in [14]. Unlike EFM, the general 1-D algorithm
in [14] formulates the criterion entirely differently. It shows explicitly that the true error is linearly related to
the numerator coefficients whereas the denominator is nonlinearly related. The possibility of extending this work
to the design of 2-D filters with arbitrary denominator and numerator orders is presently under
investigation.
Fig. 1 : The True Spatial-Domain response for a 15 x 15 Gaussian filter.

Fig. 2 : The Estimated Spatial-Domain response with m1 = m2 = 4

Fig. 3 : The Estimated Spatial-Domain response with m1 = m2 = 5

Fig. 4 : The Estimated Spatial-Domain response with m1 = m2 = 6
Section - 3.5 : OPTIMAL FREQUENCY DOMAIN DESIGN OF DENOMINATOR SEPARABLE
TWO-DIMENSIONAL DIGITAL IIR FILTERS

SUMMARY 

Classical design techniques using Butterworth, Chebyshev or Elliptic polynomials are limited to particular
types of design specifications, such as bandpass, lowpass, etc. A least-squares technique is presented for designing
quarter-plane separable-denominator 2-D IIR filters to best approximate a prescribed frequency domain (FD)
specification of any arbitrary shape. A Structured Matrix Approximation approach is utilized to show that the FD error
vector is linearly related to the 2-D numerator coefficients whereas the relationship with the 2-D denominators is
quasi-linear. Furthermore, the numerator and denominator estimation problems are theoretically decoupled. The
quasi-linear relationship with the denominator is used to formulate an algorithm for iterative estimation of the
denominator. The numerator is found in one step using the estimated denominator. Computer simulations show
the effectiveness of the proposed method and its superior performance compared to several existing methods.

I.  Introduction  : 

Design of 2-D digital IIR filters from arbitrary frequency domain specifications is a highly nonlinear
optimization problem [1-10]. Existing designs make use of variations of general nonlinear optimization techniques,
such as Newton-Raphson or Fletcher-Powell or linear programming, to meet the prescribed design specifications
[3, 4, 6-8]. But these general methods are computationally intensive, highly sensitive to the choice of initial
estimates and may take a large number of iterations. Also, none of these methods makes use of the underlying
matrix structures inherent in the 2-D filter design problem. For Spatial Domain designs, it has been shown by
several researchers (including the second author) that appropriate utilization of the underlying matrix structures
leads to an insightful theoretical framework and efficient computational algorithms [1, 2, 5, 9, 10]. In practice though,
the filter specifications are usually prescribed in the frequency domain and hence, direct design in the frequency
domain would certainly be more desirable. To the best knowledge of the authors, no structured matrix framework
has been developed for frequency-domain 2-D IIR filter design. The primary goal of this work is to fill this gap
by demonstrating that an equivalent structured matrix framework does exist in the frequency domain also and
furthermore, it can be utilized equally effectively for designing 2-D IIR filters. Though the proposed framework
can be adapted for general cases, we present the design of denominator-separable filters here because the inherent
symmetry in many commonly used 2-D filters conforms to the separable-denominator structure and the stability
of these filters can be easily verified.

This work shows that the optimal 2-D rational model identification problem belongs to a special class of
mixed-nonlinear optimization problems where the linear and nonlinear parameters appear separately. Furthermore, the
mixed nonlinear criterion can be decoupled into a purely linear problem for estimating the numerator and a
separate nonlinear problem of reduced dimensionality for estimating the separable denominators. The matrix
structure of the nonlinear denominator criterion naturally leads to an iterative algorithm whereas the numerator
is estimated with a single step of Least-Squares estimation. In simulations, the proposed approach provides
a better match than various existing general approaches.

II.  Problem  Definition  : 

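The transfer function of a 2-D separable-denominator LSI system is given by

$$H(z_1, z_2) = \frac{A(z_1, z_2)}{B(z_1) C(z_2)} = \frac{\mathbf{z}_1^T A \, \mathbf{z}_2}{\sum_{i=0}^{m_1} b(i) z_1^{-i} \; \sum_{j=0}^{m_2} c(j) z_2^{-j}}, \tag{1}$$

where, $b \triangleq [b(0) \; b(1) \; \cdots \; b(m_1)]^T$, $c \triangleq [c(0) \; c(1) \; \cdots \; c(m_2)]^T$,

$$A \triangleq \begin{bmatrix} a(0,0) & a(0,1) & \cdots & a(0,m_2) \\ a(1,0) & a(1,1) & \cdots & a(1,m_2) \\ \vdots & \vdots & & \vdots \\ a(m_1,0) & a(m_1,1) & \cdots & a(m_1,m_2) \end{bmatrix} \tag{2}$$

and $\mathbf{z}_1$ and $\mathbf{z}_2$ are vectors of the form $\mathbf{z}_i \triangleq [1 \; z_i^{-1} \; z_i^{-2} \; \cdots]^T$ with appropriate sizes. Let the $k_1 \times k_2$ desired
frequency response be

$$X_d \triangleq \begin{bmatrix} x(\omega_{11}, \omega_{21}) & x(\omega_{11}, \omega_{22}) & \cdots & x(\omega_{11}, \omega_{2k_2}) \\ \vdots & \vdots & \ddots & \vdots \\ x(\omega_{1k_1}, \omega_{21}) & x(\omega_{1k_1}, \omega_{22}) & \cdots & x(\omega_{1k_1}, \omega_{2k_2}) \end{bmatrix} \tag{3}$$

and the frequency response of the separable-denominator filter at the same frequency points be X. Let
$x_d \triangleq vec(X_d)$ and $x \triangleq vec(X)$. The problem is to estimate the coefficients in b, c, and A by optimizing
the following 2-D least-squares error criterion,

$$\min_{b, c, A} \|e\|^2 \triangleq \|x_d - x\|^2 \quad \text{with } b(0) = 1, \; c(0) = 1. \tag{4}$$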
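As an illustration of the quantities entering criterion (4), the following minimal sketch evaluates the frequency response of a separable-denominator filter on a grid and the resulting squared error. The function names and the small demonstration coefficients are hypothetical and are not taken from the designs reported in this work.

```python
import numpy as np

def separable_freq_response(A, b, c, w1, w2):
    """Evaluate z1^T A z2 / (B(z1) C(z2)) at z1 = e^{j w1[p]}, z2 = e^{j w2[q]}."""
    A = np.asarray(A, dtype=complex)
    # Row p of E1 holds [1, e^{-j w1[p]}, e^{-2j w1[p]}, ...]
    E1 = np.exp(-1j * np.outer(w1, np.arange(A.shape[0])))
    E2 = np.exp(-1j * np.outer(w2, np.arange(A.shape[1])))
    numerator = E1 @ A @ E2.T
    Bw = np.exp(-1j * np.outer(w1, np.arange(len(b)))) @ np.asarray(b, dtype=complex)
    Cw = np.exp(-1j * np.outer(w2, np.arange(len(c)))) @ np.asarray(c, dtype=complex)
    return numerator / np.outer(Bw, Cw)

def ls_criterion(Xd, A, b, c, w1, w2):
    """Squared 2-norm of vec(Xd) - vec(X), i.e. the criterion in (4)."""
    X = separable_freq_response(A, b, c, w1, w2)
    e = (Xd - X).flatten(order="F")      # vec() stacks columns
    return np.vdot(e, e).real

# Hypothetical demonstration values
A_demo = [[1.0, 0.5], [0.2, 0.1]]
b_demo = [1.0, -0.5]
c_demo = [1.0, 0.3]
w1_demo = np.linspace(0.0, np.pi, 8)
w2_demo = np.linspace(0.0, np.pi, 6)
X_demo = separable_freq_response(A_demo, b_demo, c_demo, w1_demo, w2_demo)
```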

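III. Decoupling the error-criterion :

Let $H_b(z_1)$ and $H_c(z_2)$ be the inverse filters of $B(z_1)$ and $C(z_2)$ respectively, i.e., $B(z_1) H_b(z_1) = 1$ and
$C(z_2) H_c(z_2) = 1$. The system function can therefore be written as

$$H(z_1, z_2) = \frac{A(z_1, z_2)}{B(z_1) C(z_2)} = H_b(z_1) A(z_1, z_2) H_c(z_2) = \Big( \sum_{i \ge 0} h_b(i) z_1^{-i} \Big) A(z_1, z_2) \Big( \sum_{j \ge 0} h_c(j) z_2^{-j} \Big). \tag{5}$$

Assuming $k_1 \times k_2$ significant samples for the spatial response, the above relation can be expressed in matrix
notation as

$$H = H_b^1 A \, (H_c^1)^T, \tag{6}$$

where,

$$H \triangleq \begin{bmatrix} h(0,0) & h(0,1) & \cdots & h(0,k_2-1) \\ h(1,0) & h(1,1) & \cdots & h(1,k_2-1) \\ \vdots & \vdots & & \vdots \\ h(k_1-1,0) & h(k_1-1,1) & \cdots & h(k_1-1,k_2-1) \end{bmatrix}, \tag{7}$$

$$H_b^1 \triangleq \begin{bmatrix} h_b(0) & 0 & \cdots & 0 \\ h_b(1) & h_b(0) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ h_b(n_1) & h_b(n_1-1) & \cdots & h_b(0) \\ \vdots & \vdots & & \vdots \\ h_b(k_1-1) & h_b(k_1-2) & \cdots & h_b(k_1-n_1-1) \end{bmatrix}, \tag{8}$$

$$H_c^1 \triangleq \begin{bmatrix} h_c(0) & 0 & \cdots & 0 \\ h_c(1) & h_c(0) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ h_c(n_2) & h_c(n_2-1) & \cdots & h_c(0) \\ \vdots & \vdots & & \vdots \\ h_c(k_2-1) & h_c(k_2-2) & \cdots & h_c(k_2-n_2-1) \end{bmatrix}. \tag{9}$$

The frequency response of the 2-D filter can be written in a matrix-decomposed form as,

$$X = W_b H W_c^T, \tag{10}$$

where,

$$W_b \triangleq \begin{bmatrix} 1 & e^{-j\omega_{11}} & \cdots & e^{-j(k_1-1)\omega_{11}} \\ \vdots & \vdots & & \vdots \\ 1 & e^{-j\omega_{1k_1}} & \cdots & e^{-j(k_1-1)\omega_{1k_1}} \end{bmatrix} \quad \text{and} \quad W_c \triangleq \begin{bmatrix} 1 & e^{-j\omega_{21}} & \cdots & e^{-j(k_2-1)\omega_{21}} \\ \vdots & \vdots & & \vdots \\ 1 & e^{-j\omega_{2k_2}} & \cdots & e^{-j(k_2-1)\omega_{2k_2}} \end{bmatrix}. \tag{11}$$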


Applying the vec operator on both sides of (10), we get

$$vec(X) \triangleq x = vec(W_b H W_c^T) = vec\big(W_b H_b^1 A (H_c^1)^T W_c^T\big) = (W_c H_c^1 \otimes W_b H_b^1) \, vec(A) = (W_c H_c^1 \otimes W_b H_b^1) \, a. \tag{12}$$
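The step from the matrix product to the Kronecker form in (12) relies on the standard identity vec(P A Q^T) = (Q ⊗ P) vec(A). A quick numerical check is sketched below; the random matrices P and Q are merely stand-ins for W_b H_b^1 and W_c H_c^1, and their sizes are arbitrary assumptions.

```python
import numpy as np

def vec(M):
    """Stack the columns of M one below the other."""
    return M.flatten(order="F")

rng = np.random.default_rng(0)
P = rng.standard_normal((5, 3))   # stand-in for W_b H_b^1
A = rng.standard_normal((3, 2))   # numerator coefficient matrix
Q = rng.standard_normal((4, 2))   # stand-in for W_c H_c^1

lhs = vec(P @ A @ Q.T)
rhs = np.kron(Q, P) @ vec(A)
identity_holds = np.allclose(lhs, rhs)
```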

Hence, the error between the desired and the filter frequency response, as defined in (4), can be written as,

$$e = x_d - x = x_d - (W_c H_c^1 \otimes W_b H_b^1) \, a. \tag{13}$$

This expression shows explicitly that the frequency domain error is linearly related to the numerator coefficient
vector a and nonlinearly related to the denominators in a rather complicated manner. Interestingly, if the
denominator coefficients are known, the least-squares estimate of the numerator coefficients can be obtained by
minimizing (4),

$$a = (W_c H_c^1 \otimes W_b H_b^1)^{\#} x_d, \tag{14}$$

where # denotes the pseudo-inverse of the matrix. Substituting this in (13), we get the decoupled denominator
criterion,

$$\|e(b, c)\|^2 \triangleq \big\| x_d - (W_c H_c^1 \otimes W_b H_b^1)(W_c H_c^1 \otimes W_b H_b^1)^{\#} x_d \big\|^2 = \big\| \big( I_{k_1 k_2} - (P_{W_c H_c^1} \otimes P_{W_b H_b^1}) \big) x_d \big\|^2, \tag{15}$$

where, $P_Y \triangleq Y (Y^H Y)^{-1} Y^H$ denotes the projection matrix of a matrix Y, with H being the conjugate-transpose
operator. Extending Theorem 2.1 in [12], it can be shown that if the denominator is estimated by minimizing the
criterion in (15) and that estimate is used in (14), the estimates retain the global optima of the original criterion
in (4).
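The elimination of the numerator in (14)-(15) can be checked numerically. In the sketch below, random complex matrices Mb and Mc are hypothetical stand-ins for W_b H_b^1 and W_c H_c^1 (their sizes are arbitrary); the residual obtained by solving for a and substituting back is compared with the residual obtained from the Kronecker product of the two projection matrices.

```python
import numpy as np

def projector(Y):
    """Orthogonal projector onto the column space of Y."""
    return Y @ np.linalg.pinv(Y)

rng = np.random.default_rng(1)
# Hypothetical stand-ins for W_b H_b^1 (6 x 2) and W_c H_c^1 (5 x 3)
Mb = rng.standard_normal((6, 2)) + 1j * rng.standard_normal((6, 2))
Mc = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))
xd = rng.standard_normal(30) + 1j * rng.standard_normal(30)

M = np.kron(Mc, Mb)                    # matrix multiplying a in (13)
a_hat = np.linalg.pinv(M) @ xd         # numerator estimate, as in (14)
residual_direct = xd - M @ a_hat       # error after eliminating a
residual_projected = (np.eye(30) - np.kron(projector(Mc), projector(Mb))) @ xd
```

Both residuals agree, which is why the denominator search can proceed on (15) without carrying the numerator along.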

IV.  Reparametrization  of  the  error-criterion  : 

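In this section the decoupled criterion in (15) will be directly related to the denominator coefficients. The
inverse filter relation $B(z_1) H_b(z_1) = 1$ can be expressed in matrix notation as

$$B_u H_b = I_{k_1}, \tag{16}$$

where,

$$B_u \triangleq \begin{bmatrix} b(0) & 0 & \cdots & \cdots & 0 \\ \vdots & \ddots & & & \vdots \\ b(m_1) & \cdots & b(0) & & \vdots \\ & \ddots & & \ddots & 0 \\ 0 & \cdots & b(m_1) & \cdots & b(0) \end{bmatrix} \quad \text{and} \quad H_b \triangleq \begin{bmatrix} h_b(0) & 0 & \cdots & \cdots & 0 \\ \vdots & \ddots & & & \vdots \\ h_b(n_1) & h_b(n_1-1) & \cdots & h_b(0) \;\; 0 \;\cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ h_b(k_1-1) & h_b(k_1-2) & \cdots & \cdots & h_b(0) \end{bmatrix} \tag{17}$$

$$\triangleq [\, H_b^1 \mid H_b^2 \,]. \tag{18}$$

Here $H_b^1$ contains the first $n_1 + 1$ columns of $H_b$, and $B^T$ denotes the last $k_1 - n_1 - 1$ rows of $B_u$.
Let $W_b^{-1} W_b = I_{k_1}$. This inverse exists because the frequencies $\omega_{1i}$'s are distinct. Using it in (16) with (17) and
(18),

$$B_u W_b^{-1} W_b H_b = \begin{bmatrix} \cdot & \cdot \\ B^T W_b^{-1} W_b H_b^1 & B^T W_b^{-1} W_b H_b^2 \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}. \tag{19}$$

The bottom-left block of the matrix at right suggests that $W_b^{-H} B$ and $W_b H_b^1$ are orthogonal, i.e.,
$(W_b^{-H} B)^H (W_b H_b^1) = 0$. Also, since $rank(W_b^{-H} B) + rank(W_b H_b^1) = (k_1 - n_1 - 1) + (n_1 + 1) = k_1$, using a theorem
on projection matrices,

$$P_{W_b^{-H} B} + P_{W_b H_b^1} = I_{k_1}. \tag{20}$$

Similarly, from the inverse filter relation $C(z_2) H_c(z_2) = 1$, we can get

$$P_{W_c^{-H} C} + P_{W_c H_c^1} = I_{k_2}. \tag{21}$$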
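The complementary-projection property behind (20) can be illustrated in the simplest setting. The sketch below is a 1-D illustration that assumes the frequency matrix is replaced by the identity (i.e. the spatial-domain analogue), with hypothetical coefficient values and helper names: it builds the Toeplitz matrix of the inverse-filter response, the banded matrix of denominator coefficients that annihilates it, and confirms that the two projectors add up to the identity.

```python
import numpy as np

def inverse_filter_response(b, k):
    """First k samples of the impulse response of 1/B(z), assuming b[0] == 1."""
    h = np.zeros(k)
    for t in range(k):
        acc = 1.0 if t == 0 else 0.0
        for j in range(1, min(t, len(b) - 1) + 1):
            acc -= b[j] * h[t - j]
        h[t] = acc
    return h

def toeplitz_columns(first_col, ncols):
    """Lower-triangular Toeplitz matrix whose columns are shifts of first_col."""
    k = len(first_col)
    T = np.zeros((k, ncols))
    for c in range(ncols):
        T[c:, c] = first_col[:k - c]
    return T

def annihilator(b, k, n):
    """k x (k-n-1) banded matrix B of denominator coefficients with B.T @ H1 == 0."""
    B = np.zeros((k, k - n - 1))
    for r in range(k - n - 1):
        t = n + 1 + r                    # output sample index constrained to zero
        for j in range(len(b)):
            if t - j >= 0:
                B[t - j, r] = b[j]
    return B

def projector(Y):
    return Y @ np.linalg.pinv(Y)

b_demo = [1.0, -0.9, 0.5]                # hypothetical denominator, b(0) = 1
k_demo, n_demo = 10, 2
h_b = inverse_filter_response(b_demo, k_demo)
H1 = toeplitz_columns(h_b, n_demo + 1)   # analogue of H_b^1
B = annihilator(b_demo, k_demo, n_demo)
projector_sum = projector(B) + projector(H1)
```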

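Substituting the above relations in (15) and using the Kronecker product representation ($\otimes$), the error can be written
as

$$e(b, c) \triangleq \big[ (I_{k_2} - P_{W_c^{-H} C}) \otimes P_{W_b^{-H} B} + P_{W_c^{-H} C} \otimes I_{k_1} \big] x_d = \Big[ \big( (I_{k_2} - P_{W_c^{-H} C}) \otimes V_b \big) \mathcal{X}_1 \;\;\vdots\;\; (V_c \otimes I_{k_1}) \mathcal{X}_2 \Big] \begin{pmatrix} b \\ c \end{pmatrix}, \tag{22}$$

where, $\mathcal{X}_1$ and $\mathcal{X}_2$ are formed with prescribed data, $V_b \triangleq (W_b^{-H} B) \big( (W_b^{-H} B)^H (W_b^{-H} B) \big)^{-1}$ and
$V_c \triangleq (W_c^{-H} C) \big( (W_c^{-H} C)^H (W_c^{-H} C) \big)^{-1}$.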


V.  Simulation  Results  : 

Several designs were implemented using the proposed algorithm and the performances were compared with
existing approaches. Fig. 1-3 show the results of the comparison. Fig. 1a, 2a and 3a show results using the
methods proposed in [6], [7] and [8], respectively. For the same or lower numerator/denominator orders, Fig. 1b, 2b
and 3b show the corresponding results using the proposed method. The relative rms errors [7] for the results in
Fig. 1a, 2a and 3a are 0.67, 0.28 and 0.77, respectively. The errors for the results in Fig. 1b, 2b and 3b are 0.21,
0.26 and 0.68, respectively. Clearly, the proposed approach found a better match with fewer coefficients
in all cases. The number of iterations for the proposed approach was less than 10 in all cases, whereas the general
optimization approaches sometimes took close to a hundred or more iterations.
Section - 3.6 : OPTIMAL SPATIAL-DOMAIN DESIGN OF 2-D IIR FILTERS


Summary 

In this Section we present a structured matrix approximation framework to develop the most general form
for optimal least-squares (LS) design of 2-D recursive filters from prescribed spatial domain data. Unlike the
work in Section 3.4, no separability is assumed for the 2-D denominator. Utilizing matrix structures inherent
in this problem, it is shown that the exact $\ell_2$ error has a purely linear relationship with the 2-D numerator
parameters whereas the 2-D denominator coefficients are nonlinearly related to the error. But more interestingly,
the denominator and numerator estimation problems are theoretically decoupled into separate problems without
affecting any optimality properties. In the decoupled form, the numerator estimation problem is shown to be
purely linear. For estimating the denominator also, it is shown that the decoupled $\ell_2$ error vector possesses a
quasi-linear relationship with the denominator coefficients. Decoupled estimation leads to reduced computational
complexity because there is no need for iterating on the numerators. Simulation results indicate that for several
common filter design problems, the proposed general version performs better than the separable design developed
earlier in Section 3.4.

Introduction 

Many 2-D filter synthesis algorithms have been developed by extending existing algorithms for 1-D filter
design. Specifically, Shanks et al. [1] extended the work of Shanks [5]; Cadzow [2] and Shaw and Mersereau [3]
utilized many of the general nonlinear optimization methods; and Shaw and Mersereau [3] also extended the work
of Steiglitz and McBride [6]. But these methods are suboptimal in the sense that they do not optimize the exact
fitting error criterion. In contrast to these approaches, the iterative method (EFM) proposed by Evans and Fischl
[7] is optimal in the 1-D case because it does optimize the true error criterion. There have been some previous
attempts at generalizing EFM to 2-D also [10-13] but, as shown in this paper, the complete error criterion for
the most general case has not yet been developed or optimized. Even the suboptimal error criterion had not been
optimized w.r.t. the filter coefficients in two dimensions simultaneously. The Evans-Fischl method has been found
to be highly accurate for 1-D filter design. In a recent work, we have extended 1-D EFM to designing 2-D IIR
filters with separable denominators [8], where, unlike several existing 2-D methods [1, 3, 10-13], the exact fitting
error was minimized w.r.t. the filter coefficients in both dimensions simultaneously. Simultaneous optimization
was shown to be effective for some commonly occurring design problems with symmetric spatial response.

In this paper we present a structured matrix approximation framework to develop the most general 2-D
version of EFM for optimal least-squares (LS) design of 2-D recursive filters from prescribed spatial domain
data. Utilizing matrix structures inherent in this problem, it is shown that the exact $\ell_2$ error has a purely linear
relationship with the 2-D numerator parameters whereas the 2-D denominator coefficients are nonlinearly related
to the error. But more interestingly, these two sets of parameters appear separately in the 2-D LS criterion.
Hence, using a theorem on separability from the Numerical Analysis literature, it is shown that the numerator and
denominator estimation problems can be mathematically decoupled without affecting any optimality properties.
In the decoupled form, the numerator estimation problem is shown to be purely linear. For estimating the
denominator also, it is shown that the decoupled $\ell_2$ error vector possesses a quasi-linear relationship with the
denominator coefficients. Decoupled estimation leads to reduced computational complexity because there is no
need for iterating on the numerators. Simulation results indicate that for several common filter design problems, the
proposed general version performs better than the separable design developed earlier by the Principal Investigator.
Formulation of the Problem

Let the prescribed first-quadrant spatial impulse response of size $k_1 \times k_2$ be given by

$$X \triangleq \begin{bmatrix} x(0,0) & x(0,1) & \cdots & x(0,k_2-1) \\ x(1,0) & x(1,1) & \cdots & x(1,k_2-1) \\ \vdots & \vdots & & \vdots \\ x(k_1-1,0) & x(k_1-1,1) & \cdots & x(k_1-1,k_2-1) \end{bmatrix}. \tag{1}$$

Let

$$H(z_1, z_2) = \frac{A(z_1, z_2)}{B(z_1, z_2)} = \frac{\sum_{i=0}^{n_1} \sum_{j=0}^{n_2} a(i,j) \, z_1^{-i} z_2^{-j}}{\sum_{i=0}^{m_1} \sum_{j=0}^{m_2} b(i,j) \, z_1^{-i} z_2^{-j}} \tag{2}$$

be the transfer function of the 2-D filter to be designed to approximate X and let its $k_1 \times k_2$ spatial response be
given by

$$H \triangleq \begin{bmatrix} h(0,0) & h(0,1) & \cdots & h(0,k_2-1) \\ h(1,0) & h(1,1) & \cdots & h(1,k_2-1) \\ \vdots & \vdots & & \vdots \\ h(k_1-1,0) & h(k_1-1,1) & \cdots & h(k_1-1,k_2-1) \end{bmatrix}. \tag{3}$$

In order to develop the structured-matrix representation of the 2-D filter design problem, define two matrices
containing the numerator and denominator coefficients as

$$A \triangleq \begin{bmatrix} a(0,0) & a(0,1) & \cdots & a(0,n_2) \\ a(1,0) & a(1,1) & \cdots & a(1,n_2) \\ \vdots & \vdots & & \vdots \\ a(n_1,0) & a(n_1,1) & \cdots & a(n_1,n_2) \end{bmatrix} \tag{4}$$

and

$$B \triangleq \begin{bmatrix} b(0,0) & b(0,1) & \cdots & b(0,m_2) \\ b(1,0) & b(1,1) & \cdots & b(1,m_2) \\ \vdots & \vdots & & \vdots \\ b(m_1,0) & b(m_1,1) & \cdots & b(m_1,m_2) \end{bmatrix}, \tag{5}$$

respectively. In vector form define, x = vec(X), h = vec(H), a = vec(A) and b = vec(B), where vec is the
operation of stacking all the columns of a matrix one below the other. The problem addressed in this paper is to
estimate a and b by minimizing the following $\ell_2$-norm of the error between x and h, i.e.,

$$\min_{a, b} \|e\|^2 \triangleq \|x - h\|^2. \tag{6}$$
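For concreteness, the spatial response H appearing in the criterion can be generated from given coefficient matrices A and B through the 2-D recursion implied by A(z1, z2) = H(z1, z2) B(z1, z2). The sketch below is a direct, unoptimized illustration; the function names and demonstration coefficients are hypothetical.

```python
import numpy as np

def impulse_response_2d(A, B, k1, k2):
    """First-quadrant k1 x k2 impulse response of A(z1, z2) / B(z1, z2)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    H = np.zeros((k1, k2))
    for i in range(k1):
        for j in range(k2):
            acc = A[i, j] if (i < A.shape[0] and j < A.shape[1]) else 0.0
            for p in range(min(i, B.shape[0] - 1) + 1):
                for q in range(min(j, B.shape[1] - 1) + 1):
                    if p or q:           # skip the b(0, 0) term
                        acc -= B[p, q] * H[i - p, j - q]
            H[i, j] = acc / B[0, 0]
    return H

def ls_error(X, A, B):
    """Squared 2-norm of vec(X) - vec(H), i.e. the criterion being minimized."""
    H = impulse_response_2d(A, B, *X.shape)
    e = (X - H).flatten(order="F")
    return float(e @ e)

# Hypothetical demonstration coefficients
A_demo = np.array([[1.0, 0.5], [0.3, 0.1]])
B_demo = np.array([[1.0, -0.4], [-0.3, 0.1]])
H_demo = impulse_response_2d(A_demo, B_demo, 6, 6)
```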


Decoupling the Error Criterion

Equation (2) can be rewritten as

$$A(z_1, z_2) = H(z_1, z_2) \, B(z_1, z_2). \tag{7}$$

Note that $k_1 \times k_2$ significant samples of the desired spatial response are to be matched. Hence, by equating on
both sides of (7) the coefficients of equal powers of $z_1^{-1}$ and $z_2^{-1}$ up to $k_1 - 1$ and $k_2 - 1$, respectively, equation
(7) can be expressed using matrix-vector notation as

$$\begin{bmatrix} a \\ \cdots \\ 0 \end{bmatrix} = \begin{bmatrix} H^1 \\ \cdots \\ H^2 \end{bmatrix} b = \begin{bmatrix} B^{1T} \\ \cdots \\ B^T \end{bmatrix} h, \tag{8}$$

where $H^1$, $H^2$, $B^1$ and $B$ are defined in the Appendix. If the impulse response H and the denominator coefficients
b were known, the numerator coefficients a can be calculated from the top part of (8) as

$$a = H^1 b = B^{1T} h. \tag{9}$$

However, in this case the exact H is unknown. Hence, replacing $H^2$ in the lower half of (8) by $X^2$, formed using
the corresponding elements of X, would produce the following 'equation error',

$$d(b) \triangleq X^2 b = B^T x. \tag{10}$$

From (6), x = h + e. Using this in (10), we get

$$d(b) = B^T (h + e) = B^T e, \quad \text{using (8)}. \tag{11}$$

Also according to (8), B is orthogonal to h. Hence, based on the orthogonality principle [9], an inverse relationship
can be established using similar steps as in [7, 8],

$$\begin{aligned} e &= B (B^T B)^{-1} B^T x \\ &\triangleq W B^T x \\ &= W X^2 b \\ &\triangleq W [\, g \;\vdots\; G \,] \, b \\ &= W g + W G \bar{b}, \end{aligned} \tag{12}$$

where $b \triangleq [1 \;\vdots\; \bar{b}^T]^T$, g is the first column of $X^2$ and G contains the remaining columns of $X^2$. If W is assumed
to be independent of b, by setting the gradient of $\|e\|^2$ in (12) to zero, the denominator vector $\bar{b}$ can be estimated
as,

$$\bar{b} = -(G^T W^T W G)^{-1} G^T W^T W g. \tag{13}$$

But since W does depend on b, the above equation will be used to estimate $\bar{b}^{(i)}$ iteratively, with W formed using
$\bar{b}^{(i-1)}$ estimated at the previous iteration. A convenient initial estimate of b can be obtained by minimizing the
equation error in (10) as

$$\bar{b}^{(0)} = -(G^T G)^{-1} G^T g.$$
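The iteration built on (12)-(13) can be sketched in its simplest form. The following is a 1-D analogue, an assumption made to keep the structured matrices small; it is not the 2-D block-Toeplitz construction of the Appendix, and all function names and test coefficients are hypothetical. The data matrix plays the role of X^2 = [g : G], the banded matrix of denominator coefficients plays the role of B, and the weight W = B (B^T B)^{-1} is rebuilt from the previous estimate at every pass.

```python
import numpy as np

def rational_impulse_response(a, b, k):
    """First k impulse-response samples of A(z)/B(z), assuming b[0] == 1."""
    h = np.zeros(k)
    for t in range(k):
        acc = a[t] if t < len(a) else 0.0
        for j in range(1, min(t, len(b) - 1) + 1):
            acc -= b[j] * h[t - j]
        h[t] = acc
    return h

def data_matrix(x, n, m):
    """Rows t = n+1 .. k-1 with entries x[t - j], j = 0 .. m (1-D analogue of X^2)."""
    k = len(x)
    X2 = np.zeros((k - n - 1, m + 1))
    for r, t in enumerate(range(n + 1, k)):
        for j in range(m + 1):
            if t - j >= 0:
                X2[r, j] = x[t - j]
    return X2

def annihilator(b, k, n):
    """k x (k-n-1) banded matrix B such that B.T @ x equals data_matrix(x) @ b."""
    B = np.zeros((k, k - n - 1))
    for r, t in enumerate(range(n + 1, k)):
        for j in range(len(b)):
            if t - j >= 0:
                B[t - j, r] = b[j]
    return B

def fit_denominator(x, n, m, iters=10):
    """Quasi-linear iteration: freeze W, solve the weighted LS problem, repeat."""
    x = np.asarray(x, dtype=float)
    X2 = data_matrix(x, n, m)
    g, G = X2[:, 0], X2[:, 1:]
    b_bar = -np.linalg.lstsq(G, g, rcond=None)[0]        # equation-error start
    for _ in range(iters):
        B = annihilator(np.concatenate(([1.0], b_bar)), len(x), n)
        W = B @ np.linalg.inv(B.T @ B)                   # formed from previous estimate
        b_bar = -np.linalg.lstsq(W @ G, W @ g, rcond=None)[0]
    return np.concatenate(([1.0], b_bar))

# Hypothetical noise-free test case: numerator order n = 1, denominator order m = 2
a_true = [1.0, 0.4]
b_true = [1.0, -1.2, 0.5]
x_demo = rational_impulse_response(a_true, b_true, 20)
b_est = fit_denominator(x_demo, n=1, m=2)
```

With exact data the equation error vanishes at the true denominator, so the starting estimate is already exact and the iteration leaves it unchanged; with noisy data the re-weighting is what moves the estimate from the equation-error solution toward the minimizer of the true error.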

Simulation  Results 

Computer simulations have been done to design zero-phase lowpass [14], zero-phase bandpass [14], Gaussian
and Laplacian filters. It has been found that for the same filter orders, the proposed method performs better than the
separable-denominator case [8] for the lowpass and the bandpass cases, while for the Gaussian and the Laplacian
cases the separable-denominator method [8] appears to perform better. Also, in all the examples, the proposed
method performed better than the modified Prony's method given in [14]. Some plots of the obtained designs
have been included.

Appendix 
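Definitions of the matrices used in (8) are as follows. It should be noted that the matrix structures have
been given only for $n_1 + 1 \le m_1$, $n_2 + 1 \le m_2$, the modifications for other cases being obvious.

$$H^1 \triangleq \begin{bmatrix} H_0^1 & 0 & \cdots & \cdots & \cdots & 0 \\ H_1^1 & H_0^1 & 0 & \cdots & \cdots & 0 \\ \vdots & & \ddots & & & \vdots \\ H_{n_2}^1 & H_{n_2-1}^1 & \cdots & H_0^1 & 0 \;\cdots & 0 \end{bmatrix} \in \mathbb{R}^{(n_1+1)(n_2+1) \times (m_1+1)(m_2+1)}, \quad \text{where,} \tag{A.1}$$

$$H_i^1 \triangleq \begin{bmatrix} h(0,i) & 0 & \cdots & \cdots & \cdots & 0 \\ h(1,i) & h(0,i) & 0 & \cdots & \cdots & 0 \\ \vdots & & \ddots & & & \vdots \\ h(n_1,i) & h(n_1-1,i) & \cdots & h(0,i) & 0 \;\cdots & 0 \end{bmatrix} \in \mathbb{R}^{(n_1+1) \times (m_1+1)}, \quad \text{and} \tag{A.2}$$

$$H^2 \triangleq \begin{bmatrix} J_0^1 & 0 & \cdots & 0 \\ J_1^2 & J_0^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ J_{m_2}^{m_2+1} & J_{m_2-1}^{m_2+1} & \cdots & J_0^{m_2+1} \\ \vdots & & & \vdots \\ J_{k_2-1}^{k_2} & J_{k_2-2}^{k_2} & \cdots & J_{k_2-m_2-1}^{k_2} \end{bmatrix} \in \mathbb{R}^{[k_1 k_2 - (n_1+1)(n_2+1)] \times (m_1+1)(m_2+1)}, \quad \text{where,} \tag{A.3}$$

$$J_i^j \triangleq \begin{bmatrix} h(n_1+1,i) & \cdots & h(0,i) & 0 \;\cdots \\ \vdots & & & \ddots \\ h(m_1,i) & \cdots & \cdots & h(0,i) \\ \vdots & & & \vdots \\ h(k_1-1,i) & \cdots & \cdots & h(k_1-m_1-1,i) \end{bmatrix} \in \mathbb{R}^{(k_1-n_1-1) \times (m_1+1)}, \quad \text{for } j \le (n_2+1), \tag{A.4}$$

$$J_i^j \triangleq \begin{bmatrix} h(0,i) & 0 & \cdots & 0 \\ \vdots & \ddots & & \vdots \\ h(m_1,i) & h(m_1-1,i) & \cdots & h(0,i) \\ \vdots & & & \vdots \\ h(k_1-1,i) & h(k_1-2,i) & \cdots & h(k_1-m_1-1,i) \end{bmatrix} \in \mathbb{R}^{k_1 \times (m_1+1)}, \quad \text{for } j > (n_2+1). \tag{A.5}$$

$$B^{1T} \triangleq \begin{bmatrix} B_0^1 & 0 & \cdots & \cdots & \cdots & 0 \\ B_1^1 & B_0^1 & 0 & \cdots & \cdots & 0 \\ \vdots & & \ddots & & & \vdots \\ B_{n_2}^1 & \cdots & \cdots & B_0^1 & 0 \;\cdots & 0 \end{bmatrix} \in \mathbb{R}^{(n_1+1)(n_2+1) \times k_1 k_2}, \quad \text{where,} \tag{A.6}$$

$$B_i^1 \triangleq \begin{bmatrix} b(0,i) & 0 & \cdots & \cdots & \cdots & 0 \\ b(1,i) & b(0,i) & 0 & \cdots & \cdots & 0 \\ \vdots & & \ddots & & & \vdots \\ b(n_1,i) & \cdots & \cdots & b(0,i) & 0 \;\cdots & 0 \end{bmatrix} \in \mathbb{R}^{(n_1+1) \times k_1}, \quad \text{and} \tag{A.7}$$

$$B^T \triangleq \begin{bmatrix} D_0^1 & 0 & \cdots & \cdots & \cdots & 0 \\ D_1^2 & D_0^2 & \cdots & \cdots & \cdots & 0 \\ \vdots & & \ddots & & & \vdots \\ D_{m_2}^{m_2+1} & D_{m_2-1}^{m_2+1} & \cdots & D_0^{m_2+1} & \cdots & 0 \\ & \ddots & & & \ddots & \\ 0 & \cdots & D_{m_2}^{k_2} & \cdots & \cdots & D_0^{k_2} \end{bmatrix} \in \mathbb{R}^{[k_1 k_2 - (n_1+1)(n_2+1)] \times k_1 k_2}, \quad \text{where,} \tag{A.8}$$

$$D_i^j \triangleq \begin{bmatrix} b(n_1+1,i) & \cdots & b(0,i) & 0 & \cdots & 0 \\ & \ddots & & \ddots & & \vdots \\ 0 & \cdots & b(m_1,i) & b(m_1-1,i) & \cdots & b(0,i) \end{bmatrix} \in \mathbb{R}^{(k_1-n_1-1) \times k_1}, \quad \text{for } j \le (n_2+1),$$

$$D_i^j \triangleq \begin{bmatrix} b(0,i) & 0 & \cdots & \cdots & 0 \\ b(1,i) & b(0,i) & & & \vdots \\ \vdots & & \ddots & & \vdots \\ b(m_1,i) & b(m_1-1,i) & \cdots & b(0,i) & 0 \\ & \ddots & & & \ddots \\ 0 & \cdots & b(m_1,i) & \cdots & b(0,i) \end{bmatrix} \in \mathbb{R}^{k_1 \times k_1}, \quad \text{for } j > (n_2+1). \tag{A.9}$$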
Section - 3.7 : DISTRIBUTED LOOK-AHEAD : A GENERAL APPROACH FOR PIPELINING RECURSIVE
DIGITAL FILTERS

SUMMARY 

A new Look-Ahead (LA) scheme, Distributed Look-Ahead (DLA), is proposed for pipelined implementation
of recursive digital filters. It is established that in the case of many recursive filters, DLA can provide an equivalent
and stable implementation with reduced pipeline delay and hardware complexity, when compared with some
existing LA schemes [4, 5]. The existing Scattered Look-Ahead implementation [5] achieves stability at the cost
of increased multiplication and latch complexities and considerable delay in output generation. The Clustered
Look-Ahead approach cannot always guarantee stability [5]. This work shows that, in order to attain stability, the
output samples need not be clustered or equally scattered. Indeed, in many filter design problems, stability can
be maintained by using unequally distributed past output samples. When compared with the scattered approach,
the proposed scheme uses fewer pole-zero cancellations and the introduced roots are not necessarily at
the same radii as the original filter poles. Hence, the proposed DLA scheme has reduced multiplication and latch
complexities and higher area-efficiency, and it produces outputs with reduced delays.

1.  Introduction 

Look-ahead pipelining is highly effective in attaining high sampling rate and computation speed for low-cost
VLSI implementation of digital IIR filters [3-5, 9, 13]. Among existing LA schemes, the Clustered (CLA) or
Time-Domain (TD) approach [3, 4, 9] generates the present output using contiguous past output samples whereas
the Scattered (SLA) or z-domain (ZD) approach [5, 12] uses equally separated past output samples. A desirable
feature of SLA is that stability is guaranteed. However, this is achieved at the cost of a relatively large delay in
output generation as well as increased multiplication and latch complexities to implement the numerator. On
the other hand, CLA may require filter order augmentation to maintain stability, which also comes at the cost of
increased hardware complexity [3].

In this paper, we propose a general Look-Ahead approach that opens up a large class of new possibilities to
provide stable realizations with lower pipeline delay and hardware complexity than the existing LA schemes.
Possible stability regions for the proposed scheme are also addressed. This paper argues that, in order to attain
stability in pipelined form, the output samples need not be clustered [3, 4, 9, 13] or equally scattered [5, 12].
Indeed, for many filter design problems, it is shown that stability can be maintained by using unequally distributed
past output samples. The proposed scheme is denoted as the Distributed Look-Ahead (DLA) approach, where the
number of denominator coefficients can be kept the same as that in the original filter (as in SLA) but the numerator and
denominator orders are lower than those in SLA. Hence, the proposed DLA scheme can offer reduced multiplication
and latch complexities and higher area-efficiency, and it can produce outputs with lower pipeline delay than the SLA
scheme. Unlike SLA but similar to CLA, stability is not guaranteed with the proposed scheme. However, simple
stability conditions and regions can be derived and have been presented for various pipeline stages. It is also
demonstrated that the numerator and denominator polynomials can be factorized into lower-order polynomials,
which can further simplify hardware implementation and complexity. Examples are included to demonstrate the
validity of the proposed approach.

The paper is organized as follows. In Section 2, two existing LA schemes are briefly summarized. In Section
3, the general Look-Ahead scheme is proposed along with some examples. In Section 4, stability conditions are
presented for pipelining of 2nd-order recursive filters using the proposed DLA scheme and in Section 5, several
examples are provided to demonstrate that stable pipelined implementations with DLA can be achieved with
reduced hardware complexity.
2. Existing Look-Ahead Schemes

Consider a recursive digital filter with N-th order numerator and N-th order denominator of the form,

$$H(z) = \frac{1 + b_1 z^{-1} + \cdots + b_N z^{-N}}{1 + a_1 z^{-1} + \cdots + a_N z^{-N}}, \tag{1}$$

which is to be implemented in pipelined form. The two major existing Look-Ahead forms are briefly outlined
below.

2.1. Clustered Look-Ahead (CLA) : M-stage CLA pipelining of the filter in (1) would have the following
form [3, 4, 9, 13] :

$$H^M(z) = \frac{1 + b_1 z^{-1} + b_2 z^{-2} + \cdots + b_{N+M-1} z^{-(N+M-1)}}{1 + a_M z^{-M} + \cdots + a_{M+N-1} z^{-(M+N-1)}}. \tag{2}$$

Note that the denominator coefficients are ordered in a clustered form. The numerator (non-recursive portion)
can be implemented by N + M multiplications and the denominator (recursive portion) can be implemented with
N multiplications. Thus, the total multiplication complexity is (2N + M) and the latch complexity is linear in
M. The extra delay in producing output is M.
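One way to obtain the clustered form is to multiply the numerator and denominator by the first M impulse-response samples of the all-pole part of the filter, which cancels the denominator terms in z^-1 through z^-(M-1). The sketch below illustrates this construction; it assumes a monic denominator, uses hypothetical function names and coefficients, and is not claimed to be the exact procedure of [3, 4, 9, 13].

```python
import numpy as np

def clustered_lookahead(num, den, M):
    """M-stage clustered look-ahead; num/den hold coefficients of z^0, z^-1, ..."""
    num = np.asarray(num, dtype=float)
    den = np.asarray(den, dtype=float)   # assumed monic: den[0] == 1
    # First M impulse-response samples of 1/D(z) form the augmenting polynomial.
    r = np.zeros(M)
    for t in range(M):
        acc = 1.0 if t == 0 else 0.0
        for j in range(1, min(t, len(den) - 1) + 1):
            acc -= den[j] * r[t - j]
        r[t] = acc
    return np.convolve(num, r), np.convolve(den, r)

# Hypothetical 2nd-order example with M = 3
num_demo = [1.0, 0.5, 0.25]
den_demo = [1.0, -1.2, 0.5]
cla_num, cla_den = clustered_lookahead(num_demo, den_demo, 3)
cla_pole_radius = np.max(np.abs(np.roots(cla_den)))   # stability must be checked
```

For N = 2 and M = 3 the new numerator has N + M = 5 coefficients and the new denominator has N = 2 non-trivial coefficients, located at z^-3 and z^-4. Since stability is not guaranteed for this form, the largest pole radius is computed so that it can be inspected.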

2.2. Scattered Look-Ahead (SLA): An equivalent $M$-stage pipelining of the same $N$-th order recursive filter can be obtained by [5, 12],

$$H^M(z) = \frac{1 + b_1 z^{-1} + b_2 z^{-2} + \cdots + b_{N(M-1)+L}\, z^{-(N(M-1)+L)}}{1 + a_M z^{-M} + a_{2M} z^{-2M} + \cdots + a_{NM}\, z^{-NM}} \tag{3}$$

Note that the non-zero denominator coefficients are equally 'scattered'. The multiplication complexity for the non-recursive portion of the pipelined implementation is $(NM + 1)$ and that of the recursive portion is $N$. Thus, the total multiplication complexity is $(NM + N + 1)$ and the latch complexity is square in $M$. The extra delay in producing the output is $(NM - N)$, which may be significant because of the high order of the filter in pipelined form. However, if $M$ is a power of 2, then using a decomposition technique the total multiplication and latch complexities can be further reduced [5].
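The scattered form in (3) can also be sketched in a few lines. The construction used below is an assumption, since the text does not spell it out: each pole $p$ of $H(z)$ is replaced by $p^M$ in the variable $z^{-M}$, which is the standard way a denominator in powers of $z^{-M}$ arises, and the augmentation polynomial is then the exact quotient of the new and old denominators. The name `sla_transform` is hypothetical.

```python
import numpy as np

def sla_transform(b, a, M):
    """M-stage scattered look-ahead of H(z) = B(z)/A(z).

    b, a: coefficients in ascending powers of z^-1, with a[0] == 1.
    Returns (num, den); den has non-zero terms only at multiples of M.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    N = len(a) - 1
    poles = np.roots(a)                      # poles of H(z)
    den_w = np.real(np.poly(poles ** M))     # polynomial in w = z^-M
    den = np.zeros(N * M + 1)
    den[::M] = den_w                         # scatter onto z^0, z^-M, ...
    aug, _ = np.polydiv(den, a)              # augmentation polynomial
    num = np.convolve(b, aug)
    return num, den

# A(z) = 1 - 1.4 z^-1 + 0.5 z^-2 with M = 6
num, den = sla_transform([1.0], [1.0, -1.4, 0.5], M=6)
print("denominator:", np.round(den, 4))
print("numerator order:", len(num) - 1)
```

With $N = 2$ and $M = 6$ the denominator order is $NM = 12$ and the numerator order is $N(M-1) = 10$, in line with equation (3) for a unity numerator.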

3. The Proposed Distributed Look-Ahead Pipelining (DLA)

In this new look-ahead scheme, the filter transfer function is transformed to have the form,

$$H^M(z) = \frac{1 + b_1 z^{-1} + \cdots}{1 + a_M z^{-M} + a_{M+k_1} z^{-(M+k_1)} + \cdots + a_{M+k_L}\, z^{-(M+k_L)}} \tag{4}$$

where, in general, $k_1, k_2, \ldots, k_L$ can be arbitrary integer values with $k_L = N$, in order to keep the total number of denominator ($a_i$) coefficients the same as in the original denominator in (1). It is easy to show that the two existing look-ahead schemes defined in equations (2) and (3) are special cases of this general $M$-stage look-ahead representation. Specifically, in the case of CLA, $k_i = i$, and for SLA, $k_i = Mi$, with $k_L = (N - 1)M$. Clearly, if the denominator order of DLA, $(M + k_L)$, is less than the order for SLA, $(NM)$, the proposed LA scheme would offer considerable hardware savings over SLA.

A comparison of the multiplication complexities of the three pipelining schemes is given in Table 1. Note that if $M$ is a power of 2, then using a decomposition technique, the total multiplication complexity of the SLA scheme can be further reduced to $L + N + N \log_2(M)$.

4. DLA Based Pipelining of Second-Order Filter Blocks

In this section, an iterative scheme is first given for determining the coefficients for pipelining second-order ($N = 2$) filter blocks. Several examples of DLA implementation are then given for different values of $M$.

4.1. Iterative Scheme for Obtaining Augmentation Polynomial: Consider a second order filter transfer function,

$$H(z) = \frac{B(z)}{A(z)} = \frac{B(z)}{1 + a_1 z^{-1} + a_2 z^{-2}} \tag{5}$$


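For DLA-based pipelining of this second-order filter, the only choice is $k_1 = k_L = 2$ in (4). To determine the DLA pipelined coefficients from these serial coefficients, it may be noted that $H^M(z)$ must equal the original $H(z)$, and hence $H^M(z)$ can be obtained by multiplying both the numerator and the denominator of $H(z)$ by an augmentation polynomial $D(z)$, i.e.,

$$H^M(z) = \frac{B(z)\,D(z)}{A(z)\,D(z)} = H(z) \tag{6}$$

where the coefficients of $D(z) = 1 + d_1 z^{-1} + d_2 z^{-2} + \cdots + d_M z^{-M}$ should be selected such that the denominator possesses the desired DLA form in (4). It can be shown that the coefficients of the $D(z)$ polynomial can be found recursively using the following steps,

$$\begin{aligned}
&\text{Initialize:} && d_0 = 1, \quad d_1 = -a_1 \\
&\text{Iterate:} && d_i = -a_1 d_{i-1} - a_2 d_{i-2}, \qquad i = 2, \ldots, M-1 \\
&\text{Terminate:} && d_M = -\frac{a_2}{a_1}\, d_{M-1}
\end{aligned} \tag{7}$$

The last step makes the coefficient of $z^{-(M+1)}$ in $A(z)D(z)$ vanish, so that the only non-zero denominator terms are at $z^{-M}$ and $z^{-(M+2)}$.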
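The recursion for the augmentation polynomial can be sketched as follows. This is an illustrative implementation under one stated assumption: the closing value is taken as $d_M = -a_2 d_{M-1}/a_1$, the value that cancels the $z^{-(M+1)}$ coefficient of $A(z)D(z)$ (this requires $a_1 \neq 0$ and $M \geq 2$). The function name `dla_second_order` is hypothetical.

```python
import numpy as np

def dla_second_order(a1, a2, M):
    """Augmentation polynomial D(z) for M-stage DLA of
    H(z) = 1 / (1 + a1 z^-1 + a2 z^-2).

    Returns (d, den): d holds d_0..d_M, den holds the coefficients of
    A(z) D(z) in ascending powers of z^-1.
    """
    d = np.zeros(M + 1)
    d[0] = 1.0
    d[1] = -a1
    for i in range(2, M):
        # cancels the z^-i coefficient of A(z) D(z)
        d[i] = -a1 * d[i - 1] - a2 * d[i - 2]
    # assumption: closing step that cancels the z^-(M+1) coefficient
    d[M] = -a2 * d[M - 1] / a1
    den = np.convolve([1.0, a1, a2], d)
    return d, den

d, den = dla_second_order(-1.25, 0.375, M=4)
print("D(z) coefficients:", d)
print("A(z)D(z)         :", den)
```

For $a_1 = -5/4$, $a_2 = 3/8$ and $M = 4$, the product $A(z)D(z)$ has non-zero coefficients only at $z^{0}$, $z^{-4}$ and $z^{-6}$, i.e. at $M$ and $M + k_1$ with $k_1 = 2$.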


4.2. Examples: Let the complex conjugate poles of the second order filter block be located at $z = r e^{\pm j\theta}$. Then the transfer function of the second order filter would have the form,

$$H(z) = \frac{1}{1 - 2r\cos\theta\, z^{-1} + r^2 z^{-2}} \tag{8}$$

where the numerator has been set to unity without any loss of generality. Using $a_1 = -2r\cos\theta$ and $a_2 = r^2$ in the iterations of (7), a 4-stage ($M = 4$) DLA pipelined filter can be shown to have the form,

$$H^4(z) = \frac{1 + 2r\cos\theta\, z^{-1} + r^2(2\cos 2\theta + 1) z^{-2} + 4r^3\cos\theta\cos 2\theta\, z^{-3} + 2r^4\cos 2\theta\, z^{-4}}{1 - r^4(2\cos 4\theta + 1) z^{-4} + 2r^6\cos 2\theta\, z^{-6}} \tag{9}$$

Interestingly, this transfer function can be further factorized into a more convenient decomposed form as,

$$H^4(z) = \frac{(1 + 2r\cos\theta\, z^{-1} + r^2 z^{-2})(1 + 2r^2\cos 2\theta\, z^{-2})}{(1 - 2r\cos\theta\, z^{-1} + r^2 z^{-2})(1 + 2r\cos\theta\, z^{-1} + r^2 z^{-2})(1 + 2r^2\cos 2\theta\, z^{-2})} \tag{10}$$
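The agreement between the expanded denominator of the 4-stage filter and its factored form can be checked by hand. The short derivation below uses only $a_1 = -2r\cos\theta$, $a_2 = r^2$ and the double-angle identities $4\cos^2\theta = 2 + 2\cos 2\theta$ and $4\cos^2 2\theta = 2 + 2\cos 4\theta$:

$$\begin{aligned}
(1 - 2r\cos\theta\, z^{-1} + r^2 z^{-2})(1 + 2r\cos\theta\, z^{-1} + r^2 z^{-2})
 &= 1 + (2r^2 - 4r^2\cos^2\theta)\, z^{-2} + r^4 z^{-4} \\
 &= 1 - 2r^2\cos 2\theta\, z^{-2} + r^4 z^{-4}, \\
(1 - 2r^2\cos 2\theta\, z^{-2} + r^4 z^{-4})(1 + 2r^2\cos 2\theta\, z^{-2})
 &= 1 + (r^4 - 4r^4\cos^2 2\theta)\, z^{-4} + 2r^6\cos 2\theta\, z^{-6} \\
 &= 1 - r^4(2\cos 4\theta + 1)\, z^{-4} + 2r^6\cos 2\theta\, z^{-6}.
\end{aligned}$$

The first product shows that pairing the original poles with their mirror images yields a second-order polynomial in $z^{-2}$; the second shows that one further first-order factor in $z^{-2}$ leaves only the $z^{-4}$ and $z^{-6}$ terms.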


Implementation of this decomposed form allows hardware savings over its SLA counterpart. Using similar steps, it can be further shown that 3, 6 and 8-stage DLA implementations of the second-order filter have the following forms, respectively,

$$H^3(z) = \frac{1 + 2r\cos\theta\, z^{-1} + r^2(2\cos 2\theta + 1) z^{-2} + \dfrac{r^3}{2\cos\theta}(2\cos 2\theta + 1) z^{-3}}{1 - \dfrac{r^3}{2\cos\theta}(2\cos 4\theta + 2\cos 2\theta + 1) z^{-3} + \dfrac{r^5}{2\cos\theta}(2\cos 2\theta + 1) z^{-5}} \tag{11}$$

$$H^6(z) = \frac{(1 + 2r\cos\theta\, z^{-1} + r^2 z^{-2})\left(1 + 2r^2\cos 2\theta\, z^{-2} + r^4(4\cos^2 2\theta - 1) z^{-4}\right)}{(1 - 2r\cos\theta\, z^{-1} + r^2 z^{-2})(1 + 2r\cos\theta\, z^{-1} + r^2 z^{-2})\left(1 + 2r^2\cos 2\theta\, z^{-2} + r^4(4\cos^2 2\theta - 1) z^{-4}\right)} \tag{12}$$

$$H^8(z) = \frac{(1 + 2r\cos\theta\, z^{-1} + r^2 z^{-2})\left(1 + 2r^2\cos 2\theta\, z^{-2} + r^4(2\cos 4\theta + 1) z^{-4} + 4r^6\cos 4\theta\cos 2\theta\, z^{-6}\right)}{(1 - 2r\cos\theta\, z^{-1} + r^2 z^{-2})(1 + 2r\cos\theta\, z^{-1} + r^2 z^{-2})\left(1 + 2r^2\cos 2\theta\, z^{-2} + r^4(2\cos 4\theta + 1) z^{-4} + 4r^6\cos 4\theta\cos 2\theta\, z^{-6}\right)} \tag{13}$$

A comparison of the hardware complexities between the SLA and the DLA schemes for $M = 3$, 4, 6 and 8 after their respective decompositions is given in Table 2.

From this table it is apparent that for all stages of pipelining the DLA scheme has a definite edge over the SLA scheme as far as hardware complexity is concerned. Next, the stability conditions for the DLA scheme are established for second-order filter blocks for several values of $M$.

4.3. Stability conditions: Consider the general second order filter block with complex conjugate poles at $z = r e^{\pm j\theta}$, represented by the transfer function in equation (8).

4.3.1. $M = 4$ case: The 4-stage DLA pipelined implementation of the second order filter is obtained by using the general iterative scheme discussed above. Its transfer function is given in equation (10).

This 4-stage pipelined implementation will be stable if the roots of the factor $(1 + 2r^2\cos 2\theta\, z^{-2})$ have magnitudes less than unity, that is, if $2r^2\,|\cos 2\theta| < 1$. Equivalently, for $2r^2 > 1$ and $0 \le \theta \le \pi/2$, this requires $0.5\cos^{-1}\!\left(\tfrac{1}{2r^2}\right) < \theta < 0.5\cos^{-1}\!\left(-\tfrac{1}{2r^2}\right)$. The region satisfying this stability condition is shown in Fig. 1 as the shaded area.
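The stability condition for the 4-stage case can be cross-checked numerically. The sketch below compares a closed-form test on the extra factor $(1 + 2r^2\cos 2\theta\, z^{-2})$ with the largest pole magnitude of the full pipelined denominator; the function names are hypothetical, and the closed-form test assumes the original poles satisfy $r < 1$.

```python
import numpy as np

def dla4_is_stable(r, theta):
    """Closed-form test: original poles inside the unit circle and
    the roots of 1 + 2 r^2 cos(2 theta) z^-2 inside it as well."""
    return r < 1.0 and 2.0 * r**2 * abs(np.cos(2.0 * theta)) < 1.0

def dla4_max_pole(r, theta):
    """Largest pole magnitude of the 4-stage DLA denominator."""
    a = [1.0, -2.0 * r * np.cos(theta), r**2]
    aug = np.convolve([1.0, 2.0 * r * np.cos(theta), r**2],
                      [1.0, 0.0, 2.0 * r**2 * np.cos(2.0 * theta)])
    den = np.convolve(a, aug)            # A(z) D(z)
    return np.max(np.abs(np.roots(den)))

for r, theta in [(0.9, 0.7), (0.9, 0.1)]:
    print(r, theta, dla4_is_stable(r, theta), round(dla4_max_pole(r, theta), 4))
```

Poles close to the real axis with $r$ near unity (small $\theta$) fall outside the stable region, while the same radius at a larger angle is stable.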

Hence, if a 4-stage CLA pipelined implementation produces an unstable filter, but the condition on $\theta$ stated above (or shown in Fig. 1) is satisfied, then using the proposed DLA transformation would be more appropriate than the SLA in (3), because the latter would require extra hardware for implementing both the numerator and the denominator. The exact savings in hardware for this case can be found in Table 2.

4.3.2. $M = 6$ case: Consider the 6-stage implementation of the 2nd-order filter, which has the convenient factored form shown in equation (12).

Because of the convenient quadratic factored form, the stability region is easy to derive and is displayed in Fig. 2.

It is interesting to note that these decompositions have simple power-of-2 factors and hence the corresponding hardware complexities are less than those of the SLA decomposition scheme given in [5]. The hardware savings are 2 multiplier-adder units and 8 latches (refer to Table 2).

Similar stability regions can be evaluated for other $M$ values.


5.  EXAMPLES 


5.1. Example with $M = 4$: It has been shown in [5] that the second order transfer function

$$H(z) = \frac{1}{1 - \tfrac{5}{4} z^{-1} + \tfrac{3}{8} z^{-2}} \tag{14}$$

produces an unstable filter with CLA implementation (refer to Figure 3(a)). Using the DLA formulae presented above, it can be shown that


(15) 

(16) 
Pole-zero plots of the 4-stage DLA and SLA implementations are shown in Figures 3(b) and 3(c). The DLA and the SLA implementations for this example are shown in Figures 4 and 5 respectively. They clearly demonstrate the hardware savings for the DLA case. Table 2 also shows the reduction in hardware requirements with the DLA case.
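The behaviour reported for this example can be reproduced with a short numerical check. The sketch below builds the 4-stage CLA and DLA denominators for $A(z) = 1 - \tfrac{5}{4} z^{-1} + \tfrac{3}{8} z^{-2}$ and compares their largest pole magnitudes. It assumes the second-order augmentation recursion with the closing step $d_M = -a_2 d_{M-1}/a_1$ for DLA, and a degree-$(M-1)$ augmentation polynomial for CLA; variable names are illustrative only.

```python
import numpy as np

a = np.array([1.0, -1.25, 0.375])   # 1 - (5/4) z^-1 + (3/8) z^-2
M = 4

# shared part of the recursion: d_0 .. d_{M-1}
d = [1.0, -a[1]]
for i in range(2, M):
    d.append(-a[1] * d[i - 1] - a[2] * d[i - 2])

d_cla = np.array(d)                                  # degree M-1
d_dla = np.array(d + [-a[2] * d[M - 1] / a[1]])      # assumed closing step

den_cla = np.convolve(a, d_cla)   # terms at z^0, z^-4, z^-5
den_dla = np.convolve(a, d_dla)   # terms at z^0, z^-4, z^-6

max_cla = np.abs(np.roots(den_cla)).max()
max_dla = np.abs(np.roots(den_dla)).max()
print("CLA max |pole| =", round(max_cla, 4))
print("DLA max |pole| =", round(max_dla, 4))
```

In this sketch the CLA denominator has a pole outside the unit circle while all DLA poles stay inside it, which is consistent with the statement that CLA is unstable for this filter and DLA is not.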

5.2. Example with $M = 6$: Consider the second order transfer function

$$H(z) = \frac{1}{1 - 1.4 z^{-1} + 0.5 z^{-2}} \tag{17}$$

A 6-stage CLA implementation of equation (17) yields an unstable pipelined filter. The pole-zero plot of the unstable CLA filter is shown in Figure 6(a). Using the DLA formulae given above, it is shown that the 6-stage DLA implementation is stable (refer to Figure 6(b)) and is given as,


$$H^6(z) = \frac{(1 + 1.4 z^{-1} + 0.5 z^{-2})(1 + 0.96 z^{-2} + 0.6716 z^{-4})}{1 - 0.4132 z^{-6} + 0.1825 z^{-8}} \tag{18}$$

and the corresponding SLA implementation is

$$H^6(z) = \frac{(1 + 1.4 z^{-1} + 0.5 z^{-2})(1 + 0.96 z^{-2} + 0.6716 z^{-4} + 0.2400 z^{-6} + 0.0625 z^{-8})}{1 - 0.1647 z^{-6} + 0.0156 z^{-12}}$$

The pole-zero plot for the SLA implementation is shown in Figure 6(c), and the hardware savings for the DLA case over the SLA case for this example are also given in Table 2.

Hence,  in  all  these  cases,  there  is  no  need  to  use  SLA  for  stable  pipelined  realizations  and  considerable 
hardware  savings  and  reduced  pipelining  delays  can  be  achieved  if  DLA  is  used  instead. 
Pipelining Method    Multiplication Complexity    Delay in Producing First Output
-----------------    -------------------------    -------------------------------
CLA                  L + M + N - 1                M
SLA                  NM + L                       NM
DLA                  M + k_L - N + 2L + 1         M + k_L

Table 1: Comparison of Hardware Complexities between the Various Pipelining Techniques


Pipeline Stages    Pipeline Method    Multiplier/Adder Units    Latches    Delay in Producing First Output
---------------    ---------------    ----------------------    -------    -------------------------------
M = 3              SLA                6                         10         6
                   DLA                5                         8          5
M = 4              SLA                6                         14         8
                   DLA                5                         10         6
M = 6              SLA                8                         22         12
                   DLA                6                         14         8
M = 8              SLA                8                         30         16
                   DLA                7                         18         10

Table 2: Comparison of Hardware Complexities for DLA Pipelining for various M


Figure 1: Stability Regions for M = 4 Case
Figure 2: Stability Regions for M = 6 Case
(axis label: theta in radians)

Figure 5: Implementation of a 4-stage DLA Pipelined Recursive Filter Using Decomposition Technique

Figure 6: Pole-Zero plots for M = 6 case
Section - 3.8 OPTIMAL LEAST-SQUARES DESIGN OF PIPELINED RECURSIVE FILTERS IN THE TIME-DOMAIN

Summary

Currently,  look-ahead  (LA)  pipelined  recursive  filters  are  obtained  primarily  via  transformation  of  a  given 
un-pipelined  transfer  function  [3-5,  9,  13].  For  these  approaches,  it  is  assumed  that  the  un-pipelined  transfer 
function has already been designed as an intermediate step. In this Section, we present a new algorithm (OM-LA)
for direct and optimal estimation of the coefficients of recursive filters in look-ahead pipelined form. OM-LA
is  developed  by  appropriate  modification  of  a  recently  proposed  optimal  method  (OM)  for  designing  un-pipelined 
filters  [7].  It  is  demonstrated  that  the  proposed  one-step  approximation  can  achieve  superior  match  with  reduced 
pipelined  filter  order  because  it  does  not  rely  on  pole-zero  cancelations  as  in  current  LA  pipelining  approaches. 
It  is  also  shown  that  the  denominator  polynomial  can  be  constrained  to  possess  any  of  the  possible  look-ahead 
configurations.  Unlike  some  existing  methods  [1-3],  OM-LA  minimizes  the  true  time-domain  fitting  error-norm 
between  the  prescribed  and  the  estimated  impulse  response  and  produces  superior  results.  Several  examples  are 
provided  to  illustrate  the  effectiveness  of  the  proposed  approximation  algorithm. 

1. Introduction

Look-ahead pipelining is highly effective in attaining high sampling rate and computation speed for low-cost VLSI implementation of digital IIR filters [3-5, 9, 13]. It may be noted that the original LA schemes for pipelining recursive filters [3-5] (including DLA proposed in the previous Section) consist of two steps. First, an un-pipelined (or 'serial') filter is assumed to be available, i.e., the transfer function of the filter in serial form is assumed to have been approximated by matching some prescribed or desired specification. The LA transformations are then introduced as a second step, in which the filter coefficients in pipelined form are obtained by applying either the CLA, SLA or DLA transformation to the serial filter coefficients. The LA schemes differ in the way order augmentation of the numerator and denominator polynomials is achieved. Mathematically, the inherent transfer function remains exactly identical before and after any LA transformation is applied. The higher orders in the pipelined cases are accounted for by pole-zero cancelation, which has no effect on the filter's response or its transfer function. In this part of the project, a direct approach is proposed for approximating recursive filters having desired Look-Ahead pipelined forms.

A significant drawback of the current two-step approach to pipelining recursive filters is that the degrees of freedom offered by the higher orders in the pipelined filters are not exploited in any way. Moreover, for finite precision implementation using a limited number of bits, pole-zero cancelation may cause numerical implementation problems. Hence, a key motivation for this work is to explore whether, if the look-ahead recursive filters are designed directly in a single step, a superior approximation can be achieved at lower pipelined filter order while avoiding the pole-zero cancelation problems associated with the current two-step design process. In this regard, it may be noted that frequency domain approaches have been considered in [1], while a time-domain approach has been taken in [2], though only the modified least-squares error criterion has been minimized there. In this paper, a general theoretical framework for direct and optimal Least-Squares estimation of coefficients of pipelined digital IIR filters in the time domain is presented. The proposed approximation approach is developed by appropriate modification of a recent work by the first author on optimal time-domain approximation of recursive digital filters [7]. The true nonlinear error criterion is theoretically decoupled into two separate sub-problems of lower computational complexity. Estimation of the numerator is a linear single-step problem, whereas the non-linear denominator criterion possesses a weighted quadratic form that is convenient for iterative optimization. It is shown with several examples that the proposed approach can produce pipelined filters with a better match to prescribed specs, with much lower filter orders and without any pole-zero cancelations.
The paper is organized as follows. In Section 2, existing LA schemes are briefly summarized. In Section 3, the one-step Look-Ahead pipelined filter approximation method is presented, and in Section 4, some simulation examples are provided to illustrate the effectiveness of the direct approximation approach.

2. Transfer Functions for various Look-Ahead Schemes

Consider a recursive digital filter with an $L$-th order numerator and an $N$-th order denominator of the form,

$$H(z) = \frac{B(z)}{A(z)} = \frac{1 + b_1 z^{-1} + \cdots + b_L z^{-L}}{1 + a_1 z^{-1} + \cdots + a_N z^{-N}} \tag{1}$$

which is to be implemented in pipelined form. The major existing Look-Ahead forms are briefly outlined first.

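2.1. Clustered Look-Ahead (CLA): $M$-stage CLA pipelining of this filter would have the following form [3, 4, 9, 13]:

$$H^M(z) = \frac{1 + b_1 z^{-1} + b_2 z^{-2} + \cdots + b_{L+M-1}\, z^{-(L+M-1)}}{1 + a_M z^{-M} + a_{M+1} z^{-(M+1)} + \cdots + a_{M+N-1}\, z^{-(M+N-1)}} \tag{2}$$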

2.2. Scattered Look-Ahead (SLA): An equivalent $M$-stage pipelining of the same $N$-th order recursive filter can be obtained by [5, 12],

$$H^M(z) = \frac{1 + b_1 z^{-1} + b_2 z^{-2} + \cdots + b_{N(M-1)+L}\, z^{-(N(M-1)+L)}}{1 + a_M z^{-M} + a_{2M} z^{-2M} + \cdots + a_{NM}\, z^{-NM}} \tag{3}$$

Note that the non-zero denominator coefficients are equally 'scattered'.

2.3. Distributed Look-Ahead Pipelining (DLA) [8, see the previous Section also]: In this new look-ahead scheme, the filter transfer function is transformed to have the form,

$$H^M(z) = \frac{1 + b_1 z^{-1} + \cdots}{1 + a_M z^{-M} + a_{M+k_1} z^{-(M+k_1)} + a_{M+k_2} z^{-(M+k_2)} + \cdots + a_{M+k_L}\, z^{-(M+k_L)}} \tag{4}$$

where $k_1, k_2, \ldots$, in general, can be arbitrary integer values with $k_L = N$, in order to keep the total number of denominator ($a_i$) coefficients the same as in the original denominator in (1). It is easy to show that the two existing look-ahead schemes defined in equations (2) and (3) are special cases of this general $M$-stage look-ahead representation. Specifically, in the case of CLA, $k_i = i$, and for SLA, $k_i = Mi$, with $k_L = (N - 1)M$.
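The way the three schemes differ only in the placement of the non-zero denominator terms can be made concrete with a small helper. This is a sketch under an assumption about indexing: the offsets $k_i$ are taken to run over $i = 1, \ldots, N-1$, which matches the stated special cases $k_i = i$ (CLA) and $k_i = Mi$ with last value $(N-1)M$ (SLA). All function names are hypothetical.

```python
def denominator_exponents(M, ks):
    """Exponents of z^-1 carrying non-zero denominator coefficients
    (besides the leading 1) for a given list of offsets k_1..k_L."""
    return [M] + [M + k for k in ks]

def cla_offsets(N):
    """Clustered look-ahead: k_i = i for i = 1..N-1 (assumed range)."""
    return list(range(1, N))

def sla_offsets(N, M):
    """Scattered look-ahead: k_i = M*i for i = 1..N-1 (assumed range)."""
    return [M * i for i in range(1, N)]

N, M = 3, 4
print("CLA:", denominator_exponents(M, cla_offsets(N)))
print("SLA:", denominator_exponents(M, sla_offsets(N, M)))
print("DLA, second-order block with k_1 = 2:", denominator_exponents(4, [2]))
```

A list of exponents of this kind is all that is needed later to build the column-selection matrix that forces the remaining denominator coefficients to zero.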

2.4. General Distributed Look-Ahead Representation of Recursive Filters: In general, any of the above $M$-stage Look-Ahead pipelined transfer functions can be obtained from:

$$\begin{aligned}
H^M(z) &= \frac{1 + b_1 z^{-1} + \cdots + b_Q z^{-Q}}{1 + a_M z^{-M} + a_{M+k_1} z^{-(M+k_1)} + \cdots + a_{M+k_L}\, z^{-(M+k_L)}} && (5) \\
 &\triangleq \frac{B^M(z)}{A^M(z)} && (6a) \\
 &= h(0) + h(1) z^{-1} + \cdots + h(K-1) z^{-(K-1)} + \cdots && (6b)
\end{aligned}$$

where an appropriate choice of $Q$ and a set of $k_1, k_2, \ldots, k_L$ would lead to any of the desired Look-Ahead forms in (2)-(4). Note that the DLA representations in (4) and (5) differ only in the choice of the numerator order $Q$, which need not be restricted for the approximation algorithm. In fact, by choosing $Q$ lower than the values required by (2)-(4), the total number of coefficients for the Look-Ahead representation can be reduced, if desired.

3. Proposed Method for Optimal Estimation of Coefficients of Look-Ahead Pipelined Recursive Filters

The CLA, SLA and DLA approaches to pipelining are two-step design processes which enforce "exact equality" to an already existing $H(z)$. Hence, even though the filter orders are significantly higher after any of the Look-Ahead transformations is applied, the characteristics of the filters do not change. Clearly, the original "lower order" $H(z)$ must have been obtained via some kind of approximation approach to match certain desired time-domain or frequency-domain specifications. It is well known that a superior fit can be achieved with higher filter orders. However, in the case of the existing LA schemes, no attempt is made to exploit the extra degrees of freedom of the higher order Look-Ahead realizations in order to achieve a better match than the original $H(z)$. In this Section, we propose to use an optimal least squares approach [7] to design the look-ahead recursive filters directly in a single step. The goal is to achieve a superior match to the original specs with lower filter orders than would otherwise be needed with the existing two-step procedures.

Let,

$$\mathbf{h}_d = [\,h_d(0) \;\; h_d(1) \;\; \cdots \;\; h_d(N-1)\,]^T \tag{7}$$

denote the desired impulse response (IR) of the pipelined (or un-pipelined) filter. Our goal is to estimate the $a_i$ and $b_i$ coefficients in (2), (3) or (4) so as to match this desired IR specification. Since the general DLA expression in equation (5) includes all the possible LA representations, we will outline only the steps to determine the coefficients of the general $M$-stage DLA representation in (5).

Stacking the first $N$ significant IR samples of $H^M(z)$ in (6), define,

$$\mathbf{h} = [\,h(0) \;\; h(1) \;\; \cdots \;\; h(N-1)\,]^T. \tag{8}$$

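The problem of estimating the LA coefficients to match a given $\mathbf{h}_d$ can be stated as follows.

$$\min_{\mathbf{a},\mathbf{b}} \|\mathbf{e}\|^2 \triangleq \min_{\mathbf{a},\mathbf{b}} \sum_{t=0}^{N-1} \left[ h_d(t) - \left\{ \frac{B^M(z)}{A^M(z)} \right\} \delta(t) \right]^2 \tag{9}$$

where,

$$\delta(t) = \begin{cases} 1, & t = 0 \\ 0, & \text{otherwise,} \end{cases} \tag{10}$$

$$\mathbf{e} \triangleq \mathbf{h}_d - \mathbf{h}, \tag{10a}$$

$$\mathbf{a} \triangleq [\,1 \;\; a_M \;\; \cdots \;\; a_{M+k_L}\,]^T, \text{ and} \tag{10b}$$

$$\mathbf{b} \triangleq [\,1 \;\; b_1 \;\; \cdots \;\; b_Q\,]^T. \tag{10c}$$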

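Rewriting (6) as

$$B^M(z) = H^M(z)\, A^M(z)$$

and equating the coefficients of equal powers of $z^{-1}$ on both sides of this equation,

$$\begin{bmatrix} \mathbf{b} \\ \mathbf{0} \end{bmatrix} = \begin{bmatrix} \mathbf{H}_1 \\ \mathbf{H}_2 \end{bmatrix} \mathbf{J}\,\mathbf{a} \tag{11}$$

where,

$$\mathbf{H}_1 \triangleq \begin{bmatrix} h(0) & 0 & \cdots & \cdots & 0 \\ h(1) & h(0) & \cdots & \cdots & 0 \\ \vdots & & \ddots & & \vdots \\ h(Q) & \cdots & h(0) & \cdots & 0 \end{bmatrix}, \quad \text{and} \quad \mathbf{H}_2 \triangleq \begin{bmatrix} h(Q+1) & \cdots & h(0) & \cdots & 0 \\ h(Q+2) & \cdots & h(1) & \cdots & 0 \\ \vdots & & & & \vdots \\ h(N-1) & \cdots & \cdots & \cdots & h(N-1-M-k_L) \end{bmatrix}.$$

The matrix $\mathbf{J}$ in (11) is necessary to constrain some of the coefficients in the denominator to be zero, and it is formed as follows: starting with the identity matrix of size $(M + k_L + 1)$, remove all columns corresponding to the indices of those coefficients of the denominator which are zero. The remaining matrix, of size $(M + k_L + 1) \times (L + 2)$, becomes $\mathbf{J}$. From the bottom partition of (11), i.e.,

$$\mathbf{H}_2 \mathbf{J}\,\mathbf{a} = \mathbf{0}, \tag{12}$$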

it can be shown [7] that the minimization problem stated in (9) is theoretically equivalent to first solving for the denominator in

$$\min_{\mathbf{a}} \; \mathbf{a}^T \mathbf{J}^T \mathbf{H}_2^T (\mathbf{A}^T \mathbf{A})^{-1} \mathbf{H}_2 \mathbf{J}\,\mathbf{a} \tag{13a}$$

and then estimating the numerator from the top portion of (11), i.e.,

$$\mathbf{b} = \mathbf{H}_1 \mathbf{J}\,\mathbf{a}. \tag{13b}$$

Note that the matrix $\mathbf{A}$ is a banded Toeplitz convolution matrix and can be defined similarly to $\mathbf{H}_2$, as given in [7]. The design problem has thus been decoupled into two separate sub-problems of reduced computational complexity.

Note that the denominator criterion in (13a) has a weighted-quadratic structure where the weight matrix in the middle itself depends on the unknown coefficients. For estimating $\mathbf{a}$, an iterative algorithm has been presented in [7], where the estimates at the previous iteration are used to form the middle matrix $(\mathbf{A}^T\mathbf{A})^{-1}$. An appropriate modification of that algorithm can be used here to minimize (13a) to obtain the optimal estimates of the denominator $\mathbf{a}$, from which the numerator $\mathbf{b}$ can be computed using (13b).
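The decoupled estimation described above can be sketched in code. The following is an illustrative reading of the procedure, not the algorithm of [7] itself, and it rests on several assumptions that are not spelled out in the text: the convolution matrices are built from the desired response $h_d$; $\mathbf{A}^T$ is taken to be the banded Toeplitz convolution matrix of the current denominator estimate (rows $Q+1$ onward); the leading denominator coefficient is fixed to 1; the first pass uses an identity weight; and a fixed number of reweighting passes is used. The names `conv_matrix` and `om_la` are hypothetical.

```python
import numpy as np

def conv_matrix(x, ncols):
    """Lower-triangular Toeplitz matrix T with T[t, j] = x[t - j] (0 for t < j)."""
    n = len(x)
    T = np.zeros((n, ncols))
    for j in range(ncols):
        T[j:, j] = x[:n - j]
    return T

def om_la(hd, M, ks, Q, n_iter=10):
    """Sketch of a direct look-ahead fit to a desired impulse response hd.

    M, ks: the denominator has non-zero terms at z^0, z^-M and z^-(M+k).
    Q: numerator order. Returns (a, b) with a = [1, a_M, a_{M+k_1}, ...].
    """
    hd = np.asarray(hd, dtype=float)
    n = len(hd)
    idx = [0, M] + [M + k for k in ks]     # exponents that are kept
    P = idx[-1]                            # denominator order M + k_L
    J = np.eye(P + 1)[:, idx]              # column-selection matrix J
    H = conv_matrix(hd, P + 1)
    H1, H2 = H[:Q + 1], H[Q + 1:]
    G = H2 @ J
    g0, G1 = G[:, 0], G[:, 1:]
    W = np.eye(n - Q - 1)                  # first pass: unweighted
    for _ in range(n_iter):
        # minimise a^T G^T W G a subject to a[0] = 1
        rest = np.linalg.solve(G1.T @ W @ G1, -G1.T @ W @ g0)
        a = np.concatenate(([1.0], rest))
        a_full = J @ a
        # assumed form of A^T: convolution matrix of the current denominator
        At = conv_matrix(np.pad(a_full, (0, n - P - 1)), n)[Q + 1:]
        W = np.linalg.inv(At @ At.T)       # (A^T A)^{-1} for the next pass
    b = H1 @ J @ a                         # numerator from the top partition
    return a, b

# Usage: impulse response of 1 / (1 - 1.25 z^-1 + 0.375 z^-2), 40 samples
hd = np.zeros(40)
hd[0], hd[1] = 1.0, 1.25
for t in range(2, 40):
    hd[t] = 1.25 * hd[t - 1] - 0.375 * hd[t - 2]
a_est, b_est = om_la(hd, M=4, ks=[2], Q=4)
print("a =", a_est)
print("b =", b_est)
```

Because the target here is itself exactly representable in the chosen look-ahead form, the fit recovers the denominator terms at $z^{-4}$ and $z^{-6}$ exactly; with a general desired response the reweighting passes are what distinguish the criterion in (13a) from a plain equation-error fit.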

4. Simulations on One-Step Approximation

4.1. Simulation 1

In this case, a lowpass example has been considered. In all figures, the solid line denotes the desired response and the dashed line corresponds to the response using the proposed approximation approach. Figure 1(a) shows the response of the un-pipelined filter approximation with numerator and denominator orders equal to 3. Note that this response would remain identical if the SLA filter were obtained from it using Parhi's approach [5]. The error is -9.9dB. Figure 1(b) shows the response of the CLA filter approximation with $Q = 6$, $M = 4$, $k_1 = 1$ and $k_2 = 2$, and has an error of -19.5dB. Figure 1(d) shows the response of the DLA filter with $Q = 6$, $M = 4$, $k_1 = 2$ and $k_2 = 4$, with an error of about -32dB. Finally, Figure 1(c) shows the response of the SLA filter designed directly by the OM-LA. For Figure 1(c) the error is -SOJdB. It is evident from the error values and a comparison of Figures 1(a) and 1(c) that the SLA filter designed directly by OM-LA is much superior to that of Parhi [5]. The respective pole-zero plots of the filters are shown alongside. Note that the OM-LA does not produce cancelling poles and zeros.

4.2. Simulation 2

A notch filter has been considered for this example. Figure 2(a) shows the response of the un-pipelined filter with numerator order $L = 8$ and denominator order $N = 10$. The responses with the CLA, SLA and DLA approximations are shown in Figures 2(b), (c) and (d) respectively. Note that the filter order and hardware requirements for the SLA filter would be extremely high even for a pipelining stage of $M = 3$. Table 1 shows a comparison of the hardware used in the three cases. From Table 1 and the responses in Figure 2, it is apparent that stable CLA or DLA approximations can be achieved with excellent match and much reduced hardware requirements compared with SLA.
Figure 1: Lowpass Filter: (a) Response of an un-pipelined filter (b) Response of the CLA filter (c) Response of the SLA filter obtained directly through OM-LA (d) Response of the DLA filter. Legend: - - desired, - OM-LA

Figure 2: Notch Filter: (a) Response of an un-pipelined filter (b) Response of the CLA filter (c) Response of the SLA filter obtained directly through OM-LA (d) Response of the DLA filter. Legend: - - desired, - OM-LA
Section - 3.9 PIPELINED LOOK-AHEAD IMPLEMENTATION OF A CLASS OF 2-D IIR FILTERS

SUMMARY

In the previous section we presented a new scheme (referred to as distributed look-ahead) which is a compromise between the two existing look-ahead approaches for high speed implementation of 1-D recursive digital filters. To date, neither the Scattered Look-Ahead nor the Distributed scheme has been utilized for 2-D IIR filter implementation, primarily because the 1-D stability properties of these LA schemes do not necessarily translate to general 2-D IIR filters. The primary focus of this paper is to demonstrate that for a special but very important class of 2-D IIR filters, namely Denominator Separable configurations, the benefits of these stable look-ahead schemes can indeed be taken advantage of. The efficiency of the proposed implementations and the reductions in multiplications and delays are demonstrated with some examples.

I. Introduction:

Two-dimensional (2-D) IIR filters have many practical applications, such as in radar, digital image processing, remote sensing, etc. Processing time and throughput delay are two of the major problems in implementing 2-D digital IIR filters. Look-Ahead (LA) pipelining has been found to be highly effective for attaining high sampling rate and high computation speed for low-cost VLSI implementation of recursive digital filters [1-6]. In particular, the Clustered Look-Ahead (CLA) scheme has been utilized for implementing both 1-D and 2-D IIR filters [2]. However, it is known that even for the 1-D case CLA cannot assure stability [2]. In order to avoid the stability problems of CLA, several other LA schemes have been proposed, namely Scattered Look-Ahead (SLA) [2], Minimum Augmentation CLA (MACLA) [3] and Distributed Look-Ahead (DLA) [9]. To the best of our knowledge these latter schemes have not so far been utilized for 2-D IIR filter implementation, primarily because the 1-D stability properties of these LA schemes do not necessarily translate to general 2-D IIR filters. The primary focus of this paper is to demonstrate that for a special but very important class of 2-D IIR filters, namely Denominator Separable configurations, the benefits of these stable look-ahead schemes can indeed be taken advantage of.

Separable-Denominator 2-D IIR filters have considerable practical applications. Firstly, many commonly used 2-D filters such as Gaussian, Laplacian-Gaussian, Lowpass and Bandpass filters are known to possess symmetric spatial response, and hence many of these filters inherently conform to denominator-separable transfer functions [7, 8, 11-13]. Secondly, a general 2-D filter can be approximated by a denominator-separable filter [12, 13]. Thirdly, the design of 2-D separable-denominator filters is much easier, and each of the 1-D denominators can be implemented using highly modular structures [11]. But most importantly, the stability tests for these filters are simpler and identical to those for 1-D filters.
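The remark that stability testing reduces to the 1-D case can be illustrated with a short check. The sketch below assumes the denominator is given as two 1-D polynomials $B(z_1)$ and $C(z_2)$ in ascending powers of $z^{-1}$, and tests each factor separately by its root magnitudes; the function names are hypothetical.

```python
import numpy as np

def is_stable_1d(den):
    """True if all roots of a 1-D denominator (ascending powers of z^-1,
    den[0] != 0) lie strictly inside the unit circle."""
    den = np.asarray(den, dtype=float)
    return bool(np.all(np.abs(np.roots(den)) < 1.0))

def is_stable_separable(b, c):
    """Stability test for a 2-D filter with denominator B(z1) C(z2):
    each 1-D factor is tested on its own."""
    return is_stable_1d(b) and is_stable_1d(c)

print(is_stable_separable([1.0, -1.25, 0.375], [1.0, -1.4, 0.5]))  # both factors stable
print(is_stable_separable([1.0, -1.25, 0.375], [1.0, -2.5, 1.0]))  # second factor has a pole at z = 2
```

Since a look-ahead transformation is applied to each 1-D factor independently, the same two 1-D tests can be repeated on the pipelined factors.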

Direct form realizations of 2-D denominator separable IIR filters have been attempted [11, 13], but have certain speed disadvantages. Block filtering techniques [9] with a combination of scattered look-ahead and decomposition based pipelining [2] can be used as an approach for implementation of 2-D denominator separable filters, with this approach being carried out on each of the two separated domains of the denominator of the 2-D denominator separable filter. However, the state update in the Block filters is based on the clustered look-ahead [9] approach, which does not necessarily guarantee stability, and at the same time block structures are highly complex.

In this paper, we show that, utilizing the SLA and DLA pipelining techniques, high-speed modular implementation of separable-denominator stable 2-D IIR filters is indeed feasible. It may be noted that the various stable LA schemes recast the way the output is generated by appropriate placement of the 1-D poles. The numerator does not play any role in stability considerations. Hence, if the original 1-D factors are stable to begin with, application of any of the stable LA schemes to individual 1-D factors would also maintain stability for the overall 2-D filter. Block Processing for further speed-up is also feasible for the proposed architectures.

II. Look-Ahead pipelining for 2-D Denominator-Separable IIR filters

A denominator separable 2-D IIR filter transfer function is given by

$$H(z_1, z_2) = \frac{A(z_1, z_2)}{D(z_1, z_2)} = \frac{A(z_1, z_2)}{B(z_1)\, C(z_2)} \tag{1}$$


II.1.  Clustered  Look-Ahead  Pipelining  [1,  2,  6]: 

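$M$-stage pipelining of an $(N_1, N_2)$-th order separable-denominator 2-D IIR filter can be represented as,

$$H(z_1,z_2) = \frac{A(z_1,z_2)\left(1 + d_{1,0}z_1^{-1} + d_{0,1}z_2^{-1} + \cdots\right)}{\left(1 + b_M z_1^{-M} + b_{M+1}z_1^{-(M+1)} + \cdots + b_{M+N_1-1}z_1^{-(M+N_1-1)}\right)\left(1 + c_M z_2^{-M} + c_{M+1}z_2^{-(M+1)} + \cdots + c_{M+N_2-1}z_2^{-(M+N_2-1)}\right)} \tag{2}$$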


where the $d_{i,j}$'s denote the coefficients of the extra numerator polynomial introduced by the pipelining of the denominator [2,6]. The multiplication complexity is $(2N_1+M)(2N_2+M)$ and the latch complexity is linear in $M$. The extra delay in producing the output is $M$ for each domain. It may be noted that this scheme may suffer from the same stability problems as its 1-D counterparts [2]. A clustered approach that guarantees stability with minimum augmentation of order (MACLA) may be utilized [3]. However, finding the coefficients for MACLA appears to be somewhat cumbersome. Hence, in our examples we will use SLA and DLA, which are briefly outlined next.


II.2.  Scattered  Look-Ahead  Pipelining  [2]: 

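An equivalent $M$-stage pipelining of the same $(N_1, N_2)$-th order recursive filter can be obtained by,

$$H(z_1,z_2) = \frac{A(z_1,z_2)\left(1 + d_{1,0}z_1^{-1} + d_{0,1}z_2^{-1} + \cdots\right)}{\left(1 + b_M z_1^{-M} + b_{2M}z_1^{-2M} + \cdots + b_{N_1 M}z_1^{-N_1 M}\right)\left(1 + c_M z_2^{-M} + c_{2M}z_2^{-2M} + \cdots + c_{N_2 M}z_2^{-N_2 M}\right)} \tag{3}$$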

The total multiplication complexity is $(N_1 M + M + 1)(N_2 M + M + 1)$ and the latch complexity is quadratic in $M$ in each domain. The extra delay in producing the output is $N_1(M-1) + N_2(M-1)$. However, if $M$ is a power of 2, then using a decomposition technique [2], the total number of multiplications can be reduced to $(2N_1 + N_1\log_2 M + 1)(2N_2 + N_2\log_2 M + 1)$.
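To make the complexity expressions concrete, the short sketch below simply evaluates the formulas quoted in Sections II.1 and II.2 for a (4,4)-order filter with M = 4. The function names are illustrative assumptions; only the formulas themselves come from the text.

```python
def cla_multiplications(n1, n2, m):
    """Clustered look-ahead: (2*N1 + M)(2*N2 + M)."""
    return (2 * n1 + m) * (2 * n2 + m)


def sla_multiplications(n1, n2, m):
    """Scattered look-ahead: (N1*M + M + 1)(N2*M + M + 1)."""
    return (n1 * m + m + 1) * (n2 * m + m + 1)


def sla_decomposed_multiplications(n1, n2, m):
    """Scattered look-ahead with decomposition, M a power of 2:
    (2*N1 + N1*log2(M) + 1)(2*N2 + N2*log2(M) + 1)."""
    if m < 1 or m & (m - 1):
        raise ValueError("M must be a power of 2")
    stages = m.bit_length() - 1  # exact integer log2(M)
    return (2 * n1 + n1 * stages + 1) * (2 * n2 + n2 * stages + 1)


def sla_extra_delay(n1, n2, m):
    """Extra output delay of scattered look-ahead: N1(M-1) + N2(M-1)."""
    return n1 * (m - 1) + n2 * (m - 1)


n1 = n2 = 4
m = 4
print(cla_multiplications(n1, n2, m))             # (8 + 4)^2 = 144
print(sla_multiplications(n1, n2, m))             # (16 + 4 + 1)^2 = 441
print(sla_decomposed_multiplications(n1, n2, m))  # (8 + 8 + 1)^2 = 289
print(sla_extra_delay(n1, n2, m))                 # 12 + 12 = 24
```

For this case the decomposition reduces the scattered look-ahead count from 441 to 289 multiplications, while the clustered form remains cheapest at 144 but carries the stability caveat noted in Section II.1.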

II.3. Distributed Look-Ahead Pipelining [9]:

For  this  recently  proposed  look-ahead  scheme,  the  filter  transfer  function  (2a)  is  transformed  to  the  form 

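$$H(z_1,z_2) = \frac{A(z_1,z_2)\left(1 + \cdots + d_{M+k_L,M+k_L}\,z_1^{-(M+k_L)}z_2^{-(M+k_L)}\right)}{\left(1 + b_M z_1^{-M} + b_{M+k_1}z_1^{-(M+k_1)} + \cdots + b_{M+k_L}z_1^{-(M+k_L)}\right)\left(1 + c_M z_2^{-M} + c_{M+k_1}z_2^{-(M+k_1)} + \cdots + c_{M+k_L}z_2^{-(M+k_L)}\right)} \tag{4}$$

where $k_1, k_2, \ldots$ are integers. It is easy to show that the look-ahead schemes in (2) and (3) are special cases of this general $M$-stage look-ahead approach. In [9], the stability conditions for a few low-$M$ cases have been presented.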

The examples below will demonstrate that, when compared with SLA, this new scheme can produce stable implementations with lower multiplication and latch complexities and reduced output delay. In our examples, $k_L = N_1 = N_2 = N$ will be used. Clearly, if $(M + k_1) < NM$, this scheme would provide considerable savings over the scattered approach.
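As a worked illustration for a single second-order factor with $M = 4$ and $k = N = 2$, the scattered and distributed denominators can be written out explicitly. The construction below is a sketch inferred from the structure of the numerical examples in Section III; it is an assumption and is not taken from [9]. The symbols $a$, $b$ and $c$ are introduced here only for this illustration.

```latex
\begin{align*}
B(z) &= 1 - a z^{-1} + b z^{-2}, \qquad c \triangleq 2b - a^{2},\\
B(z)\,B(-z) &= 1 + c\,z^{-2} + b^{2} z^{-4},\\
\text{SLA } (M=4):\quad
B(z)\,B(-z)\,\bigl(1 - c\,z^{-2} + b^{2} z^{-4}\bigr)
  &= 1 + (2b^{2} - c^{2})\,z^{-4} + b^{4} z^{-8},\\
\text{DLA } (M=4,\ k=2):\quad
B(z)\,B(-z)\,\bigl(1 - c\,z^{-2}\bigr)
  &= 1 + (b^{2} - c^{2})\,z^{-4} - c\,b^{2} z^{-6}.
\end{align*}
```

In this sketch the distributed form has highest delay $z^{-6}$, that is $M + k = 6 < NM = 8$, and needs one extra numerator factor fewer than the scattered form. The factor $(1 - c\,z^{-2})$ adds poles at $z^{2} = c$, so this particular construction remains stable only when $|c| < 1$; this observation applies to the sketch and is not claimed to be the stability condition stated in [9].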

III. Examples of Look-Ahead Pipelined Implementation of 2-D Denominator-Separable IIR Filters:

In this Section, we show that for 2-D denominator-separable IIR filters, any of the pipelining schemes discussed above can be easily adopted for high-speed implementation. Specifically, denominator-separable designs provide an excellent match for many commonly occurring 2-D filters, such as Gaussian, Laplacian, lowpass and bandpass filters [7]. We provide two examples, using Gaussian and Laplacian filters, to show that, with separable denominators in the two domains, the recursive sections can be easily transformed to pipelinable forms using any of the forms in (2), (3) or (4).


III.1.  Example  1  :  Gaussian  Filter  Implementation 


Consider a 4-stage ($M = 4$) implementation of a (4,4)-order 2-D Gaussian IIR filter with the following coefficients.


$$A(z_1,z_2) = \begin{bmatrix}
0.00936980949545 & -0.00126723735355 & 0.00825275276870 & 0.00489009121304 \\
-0.0012672373559 & 0.000171389878746 & -0.0011161589361 & -0.0006613695040 \\
0.00825275276913 & -0.00111615893404 & 0.00726887011928 & 0.00430710078143 \\
0.00489009121014 & -0.00066136950242 & 0.00430710077865 & 0.00255213215053
\end{bmatrix} \tag{5}$$

\begin{align}
B(z_1) &= 1 - 2.2195 z_1^{-1} + 2.0846 z_1^{-2} - 0.9754 z_1^{-3} + 0.1906 z_1^{-4} \tag{6a} \\
&= (1 - 0.96486 z_1^{-1} + 0.4557 z_1^{-2})(1 - 1.2546 z_1^{-1} + 0.41834 z_1^{-2}) \tag{6b} \\
&\triangleq B_1(z_1)\,B_2(z_1) \tag{6c}
\end{align}


In polar form ($z_i = r_i e^{\pm j\theta_i}$), $B(z_1)$ has four roots (poles) with radii $r_1 = 0.6750$ and $r_2 = 0.6467$ and angles $\theta_1 = \pm 44.38^\circ$ and $\theta_2 = \pm 14.18^\circ$, respectively. Because of symmetry, $C(z_2)$ has the same coefficients as $B(z_1)$. Hence, the poles of $C(z_2)$ are identical to those of $B(z_1)$. Each second-order factor of $B(z_1)$ and $C(z_2)$ is pipelined separately. Incidentally, the clustered approach produced an unstable filter in this case (refer to the pole-zero plot in Fig. 1). Hence, the clustered implementation given in [6] will not be suitable for pipelining this particular filter. The scattered approach can certainly be used, but the coefficients in (6) also satisfy the stability conditions given in [9]. Hence, the recently proposed distributed look-ahead scheme [9] would provide stable filters with considerable hardware savings. Next we provide the coefficients for equivalent pipelined filter implementations using both SLA and DLA schemes.
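The instability of the clustered form can be checked numerically. The sketch below assumes the standard clustered look-ahead construction, in which the extra polynomial is the first M terms of the impulse response of 1/B(z), and it tests stability with a Schur-Cohn step-down recursion instead of computing roots. Both choices, the function names, and the rounded coefficients of the second factor in (6b) are assumptions made for illustration.

```python
def clustered_look_ahead(den, m):
    """Return (extra, new_den) for M-stage clustered look-ahead of 1/den(z).

    den = [1, b1, ..., bN] holds the coefficients of powers of z^-1.
    extra is the first m terms of the impulse response of 1/den, and
    new_den = den * extra has (numerically) zero coefficients for
    z^-1 ... z^-(m-1).
    """
    extra = []
    for n in range(m):
        acc = 1.0 if n == 0 else 0.0
        for k in range(1, min(n, len(den) - 1) + 1):
            acc -= den[k] * extra[n - k]
        extra.append(acc)
    new_den = [0.0] * (len(den) + m - 1)
    for i, di in enumerate(den):
        for j, ej in enumerate(extra):
            new_den[i + j] += di * ej
    return extra, new_den


def is_stable(poly):
    """Schur-Cohn step-down test for real coefficients: True if all roots
    of poly[0] + poly[1] z^-1 + ... lie strictly inside the unit circle."""
    a = [c / poly[0] for c in poly]
    while len(a) > 1:
        k = a[-1]  # reflection coefficient of the current order
        if abs(k) >= 1.0:
            return False
        n = len(a) - 1
        a = [(a[i] - k * a[n - i]) / (1.0 - k * k) for i in range(n)]
    return True


b2 = [1.0, -1.2546, 0.4183]  # second factor in (6b), rounded
extra, den4 = clustered_look_ahead(b2, 4)
print(is_stable(b2))    # original second-order factor
print(is_stable(den4))  # 4-stage clustered look-ahead denominator
```

Under these assumptions the original factor passes the test while its 4-stage clustered denominator does not, because the extra polynomial contributes a real root slightly outside the unit circle.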


Scattered  Look-Ahead  Implementation : 

It can be shown that 4-stage pipelining of the second-order factors in (6b) yields the following forms,

$$\frac{1}{B_1(z_1)} = \frac{(1 + 0.0195 z_1^{-2} + 0.2077 z_1^{-4})(1 + 0.9649 z_1^{-1} + 0.4557 z_1^{-2})}{1 + 0.4150 z_1^{-4} + 0.0431 z_1^{-8}} \tag{7a}$$

$$\frac{1}{B_2(z_1)} = \frac{(1 + 0.7374 z_1^{-2} + 0.1750 z_1^{-4})(1 + 1.2546 z_1^{-1} + 0.4183 z_1^{-2})}{1 - 0.1938 z_1^{-4} + 0.0306 z_1^{-8}} \tag{7b}$$
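The scattered forms for the Gaussian factors can be reproduced with the power-of-2 decomposition: at each stage the current denominator is multiplied by a copy of itself in which every other non-zero term is negated, which doubles the spacing between non-zero powers. The following is a minimal sketch under that assumption; the function name is hypothetical and the input is the first factor of (6b).

```python
def scattered_look_ahead(den, m):
    """M-stage scattered look-ahead of 1/den(z) for M a power of 2.

    den = [1, b1, ..., bN] holds the coefficients of powers of z^-1.
    Returns (factors, new_den): the extra numerator factors and the
    pipelined denominator, which only contains powers of z^-M.
    """
    if m < 1 or m & (m - 1):
        raise ValueError("this sketch only handles M = 1, 2, 4, 8, ...")
    factors, step = [], 1
    while step < m:
        # Negate every other non-zero term: replace z^-step by -z^-step.
        factor = [c if (k // step) % 2 == 0 else -c
                  for k, c in enumerate(den)]
        product = [0.0] * (2 * len(den) - 1)
        for i, di in enumerate(den):
            for j, fj in enumerate(factor):
                product[i + j] += di * fj
        factors.append(factor)
        den, step = product, 2 * step
    return factors, den


factors, den = scattered_look_ahead([1.0, -0.96486, 0.4557], 4)
print([round(c, 4) for c in factors[0]])  # first extra factor, B1(-z)
print([round(c, 4) for c in factors[1]])  # second extra factor, in z^-2
print([round(c, 4) for c in den])         # denominator in z^-4
```

Up to rounding in the fourth decimal place, the two extra factors and the denominator agree with the numerator factors and denominator quoted for the first second-order block.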


Pipelining of $C_1(z_2)$ and $C_2(z_2)$ would also produce identical coefficients.

Distributed Look-Ahead Implementation :


In  this  case,  4-stage  pipelining  of  the  second-order  factors  in  (6b)  would  have  the  following  forms, 


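$$\frac{1}{B_1(z_1)} = \frac{(1 + 0.9649 z_1^{-1} + 0.4557 z_1^{-2})(1 + 0.0195 z_1^{-2})}{1 + 0.2073 z_1^{-4} + 0.0040 z_1^{-6}} \tag{8a}$$

$$\frac{1}{B_2(z_1)} = \frac{(1 + 1.2546 z_1^{-1} + 0.4183 z_1^{-2})(1 + 0.7374 z_1^{-2})}{1 - 0.3688 z_1^{-4} + 0.1291 z_1^{-6}} \tag{8b}$$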


Because of symmetry, $C_1(z_2)$ and $C_2(z_2)$ will also have identical coefficients.

It may be emphasized here that both the SLA and DLA implementations in (7) and (8), respectively, produce stable high-speed structures. However, comparing the denominators as well as the numerator factors in (7) and (8), it is easy to see that the recently proposed DLA scheme provides considerable savings in multiplication and latch complexities and reduced delay in output generation. The pole locations with 4-stage look-ahead for this example are shown in Fig. 2. In Figs. 3 and 4, the signal flow diagrams are given for a pair of second-order blocks in the two domains for 4-stage scattered and distributed look-ahead pipelining, respectively. Comparing the number of delays and multipliers in Figs. 3 and 4, it is also evident that DLA offers lower complexity than SLA.
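The comparison between the two schemes can be sketched numerically for one second-order block. The construction used here for the distributed form, multiplying B(z)B(-z) by a single factor that cancels the z^-2 term, is inferred from the numbers in this example and is an assumption; it is not taken from [9]. Function names are illustrative.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists in z^-1."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def second_order_look_ahead(a, b):
    """4-stage denominators for B(z) = 1 - a z^-1 + b z^-2.

    Returns (sla_den, dla_den): the scattered form (powers of z^-4 only)
    and the assumed distributed form (powers z^-4 and z^-6 only).
    """
    c = 2.0 * b - a * a
    even = poly_mul([1.0, -a, b], [1.0, a, b])  # 1 + c z^-2 + b^2 z^-4
    sla_den = poly_mul(even, [1.0, 0.0, -c, 0.0, b * b])
    dla_den = poly_mul(even, [1.0, 0.0, -c])
    return sla_den, dla_den


sla_den, dla_den = second_order_look_ahead(0.96486, 0.4557)
print(len(sla_den) - 1, len(dla_den) - 1)  # highest delay: 8 versus 6
print([round(v, 4) for v in dla_den])
```

In this sketch the distributed denominator reaches only z^-6 instead of z^-8, and its extra numerator factor has a single non-trivial coefficient instead of two, which is consistent with the savings described above.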


III.2.  Example  2  :  Laplacian  Filter  Implementation 


In this case, a (4,4)-order 2-D Laplacian IIR filter with the following coefficients is considered.


$$A(z_1,z_2) = \begin{bmatrix}
-0.00234823839532 & -0.00391545411710 & -0.02106505604166 & -0.00234522754996 & -0.0230375942386 \\
-0.00373345042666 & 0.00476642973501 & -0.01976926214265 & 0.01122239739174 & -0.0264271437586 \\
-0.02194869729514 & -0.02059991460608 & -0.04640872036691 & 0.14204249387362 & -0.1034676446472 \\
0.00165842866327 & 0.01338997088113 & 0.1370918579269 & 0.1639249127551 & 0.09669575185083 \\
-0.02470285791456 & -0.02800618927861 & -0.09758570689236 & 0.11045024575180 & -0.1501644414553
\end{bmatrix} \tag{9}$$

\begin{align}
B(z_1) &= 1 - 1.64180 z_1^{-1} + 1.50063 z_1^{-2} - 0.80133 z_1^{-3} + 0.21866 z_1^{-4} \tag{10a} \\
&= (1 - 0.4852 z_1^{-1} + 0.5146 z_1^{-2})(1 - 1.1566 z_1^{-1} + 0.4249 z_1^{-2}) \tag{10b} \\
&\triangleq B_1(z_1)\,B_2(z_1) \tag{10c}
\end{align}


The radii and the angles of the roots of $B(z_1)$ are $r_1 = 0.71736$, $r_2 = 0.65185$ and $\theta_1 = \pm 70.23^\circ$, $\theta_2 = \pm 27.48^\circ$, respectively. The poles of $C(z_2)$ have the same values. As in Example 1, each second-order factor of $B(z_1)$ and $C(z_2)$ is pipelined separately. The clustered approach of [6] again produced an unstable filter in this case (due to $B_2(z_1)$). However, in this case also, the coefficients in (10b) satisfy the stability conditions of DLA [9]. Hence, both the scattered and distributed look-ahead pipelining methods can provide stable filters, with DLA providing more hardware savings than SLA. The coefficients for the scattered and distributed pipelining implementations for this example are given next.
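The quoted radii and angles follow directly from the second-order factors in (10b). The sketch below recomputes them with the quadratic formula; the function name is an illustrative assumption and the angles are reported in degrees.

```python
import cmath
import math


def pole_radius_and_angle(b1, b2):
    """Poles of 1 / (1 + b1 z^-1 + b2 z^-2), i.e. roots of z^2 + b1 z + b2.

    Returns the radius and the absolute angle (degrees) of one pole of
    the pair; for a complex pair the other pole is its conjugate.
    """
    disc = cmath.sqrt(b1 * b1 - 4.0 * b2)
    pole = (-b1 + disc) / 2.0
    return abs(pole), abs(math.degrees(cmath.phase(pole)))


for b1, b2 in [(-0.4852, 0.5146), (-1.1566, 0.4249)]:
    r, theta = pole_radius_and_angle(b1, b2)
    print(f"r = {r:.5f}, theta = +/-{theta:.2f} deg")
```

For a complex-conjugate pair the radius is simply the square root of the z^-2 coefficient, so both pairs lie well inside the unit circle.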


Scattered Look-Ahead Implementation :

It can be shown that 4-stage pipelining of the second-order factors in (10b) yields the following forms,

$$\frac{1}{B_1(z_1)} = \frac{(1 - 0.7938 z_1^{-2} + 0.2648 z_1^{-4})(1 + 0.4852 z_1^{-1} + 0.5146 z_1^{-2})}{1 - 0.1005 z_1^{-4} + 0.0701 z_1^{-8}} \tag{11a}$$

$$\frac{1}{B_2(z_1)} = \frac{(1 + 0.4879 z_1^{-2} + 0.1805 z_1^{-4})(1 + 1.1566 z_1^{-1} + 0.4249 z_1^{-2})}{1 + 0.1231 z_1^{-4} + 0.0326 z_1^{-8}} \tag{11b}$$

$C_1(z_2)$ and $C_2(z_2)$ would also have identical forms.

Distributed  Look-Ahead  Implementation  : 

The second-order factors in (10b) of the Laplacian filter, when pipelined by distributed look-ahead pipelining, lead to the following forms,

$$\frac{1}{B_1(z_1)} = \frac{(1 + 0.4852 z_1^{-1} + 0.5146 z_1^{-2})(1 - 0.7938 z_1^{-2})}{1 - 0.3653 z_1^{-4} - 0.2102 z_1^{-6}} \tag{12a}$$

$$\frac{1}{B_2(z_1)} = \frac{(1 + 1.1566 z_1^{-1} + 0.4249 z_1^{-2})(1 + 0.4879 z_1^{-2})}{1 - 0.0575 z_1^{-4} + 0.0881 z_1^{-6}} \tag{12b}$$

$C_1(z_2)$ and $C_2(z_2)$ have identical forms. The pole locations for the 4-stage look-ahead schemes for this Laplacian example are given in Fig. 5, which clearly shows that both SLA and DLA provide stable implementations. [...] implementation of high-speed separable-denominator 2-D IIR filters. More examples will be included in the paper, incorporating block processing [10] for further improvement in throughput.
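The stability of the four pipelined denominators of the Laplacian example can be cross-checked without plotting poles. The sketch below applies a Schur-Cohn step-down test to the scattered and distributed denominators as read from this example; the test routine, the function name, and the dictionary labels are assumptions chosen for illustration.

```python
def is_stable(poly):
    """Schur-Cohn step-down test for real coefficients: True if all roots
    of poly[0] + poly[1] z^-1 + ... lie strictly inside the unit circle."""
    a = [c / poly[0] for c in poly]
    while len(a) > 1:
        k = a[-1]  # reflection coefficient of the current order
        if abs(k) >= 1.0:
            return False
        n = len(a) - 1
        a = [(a[i] - k * a[n - i]) / (1.0 - k * k) for i in range(n)]
    return True


# Denominators of the Laplacian example, as coefficients of z^0 ... z^-8
# (scattered) or z^0 ... z^-6 (distributed).
denominators = {
    "SLA, first block": [1, 0, 0, 0, -0.1005, 0, 0, 0, 0.0701],
    "SLA, second block": [1, 0, 0, 0, 0.1231, 0, 0, 0, 0.0326],
    "DLA, first block": [1, 0, 0, 0, -0.3653, 0, -0.2102],
    "DLA, second block": [1, 0, 0, 0, -0.0575, 0, 0.0881],
}
results = {name: is_stable(den) for name, den in denominators.items()}
for name, ok in results.items():
    print(name, "stable" if ok else "unstable")
```

All four denominators pass the test, in agreement with the statement that both look-ahead schemes give stable implementations for this filter.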


[Figure: pole-zero plots, with panels for scattered and distributed look-ahead; horizontal axis: Real part.]

FIG. 2: Pole-zero plots for the two second-order blocks using 4 stages of scattered and distributed pipelining methods for the Gaussian 2-D denominator-separable filter.

FIG. 3: Implementation of the second-order blocks in both domains after 4 stages of scattered pipelining with decomposition technique inside the recursive loop.

FIG. 4: Implementation of the second-order blocks in both domains after 4 stages of distributed pipelining with decomposition technique inside the recursive loop.

[Figure: pole-zero plots; horizontal axis: Real part.]

FIG. 5: Pole-zero plots for the two second-order blocks using 4 stages of scattered and distributed pipelining methods for the Laplacian 2-D denominator-separable filter.